Setting Up LiteLLM with Azure Databricks Serving Endpoints for Claude Code


As cloud AI offerings expand, we increasingly need to work with models hosted across different platforms. For example, Azure Databricks offers Claude models through their serving endpoints.

If you’ve ever wanted the flexibility of Claude Code’s terminal while using your organisation’s Azure Databricks infrastructure, you’re not alone.

This guide walks you through configuring Claude Code to work with Claude models hosted on Azure Databricks, using LiteLLM as a universal API proxy.


LiteLLM: The Universal Translator for AI APIs

LiteLLM acts as a proxy server that translates between different AI model APIs, providing a consistent interface regardless of where the model is actually hosted. Think of it as a universal adapter for your AI needs – plug in any model, interact with it through a standardised interface.
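To make this concrete, here is a minimal Python sketch (standard library only) of the OpenAI-style chat payload that LiteLLM accepts for every backend – only the model name changes, never the shape of the request. The helper name `make_chat_request` is my own illustration, not part of the LiteLLM API:

```python
import json

def make_chat_request(model: str, prompt: str) -> dict:
    # The same OpenAI-style payload works for any model LiteLLM proxies,
    # whether it is hosted on Databricks, Bedrock, or Anthropic directly.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Swapping backends is just a change of model name:
for model in ("databricks-claude-sonnet-4", "databricks-claude-3-7-sonnet"):
    print(json.dumps(make_chat_request(model, "Hello, Claude!")))
```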

The Setup: A Step-by-Step Guide

Let me walk you through how I’ve set up LiteLLM to expose Claude models hosted on Azure Databricks serving endpoints for use with Claude Code.

1. Azure Databricks setup

  • I will not go through how to create an Azure Databricks instance, as it’s very straightforward – just ensure you select the Premium SKU as part of the setup.

Log into your Azure Databricks instance (example URL: https://adb-1234.azuredatabricks.net/) and:

  • Create a developer access token
    • Go to your user profile (top right) → Settings → User → Developer section → Access Tokens → Generate new token
  • Serving endpoints URL
    • Note your endpoint, which will look like:
      https://your-databricks-instance.azuredatabricks.net/serving-endpoints
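Before moving on, you can sanity-check the token against the Databricks REST API, whose list-endpoints call lives under /api/2.0/serving-endpoints on the same host. A small stdlib sketch – the live call only runs when DATABRICKS_API_KEY is set; otherwise it just prints the URL it would hit:

```python
import json
import os
import urllib.request

# Derive the REST API URL from the serving-endpoints base URL
base = os.environ.get(
    "DATABRICKS_API_BASE",
    "https://your-databricks-instance.azuredatabricks.net/serving-endpoints",
)
host = base.split("/serving-endpoints")[0]
list_url = f"{host}/api/2.0/serving-endpoints"

token = os.environ.get("DATABRICKS_API_KEY")
if token:
    # Lists the serving endpoints visible to this token
    req = urllib.request.Request(list_url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        print(json.dumps(json.load(resp), indent=2))
else:
    print(f"Set DATABRICKS_API_KEY, then this would call: {list_url}")
```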

2. Setting up and configuring LiteLLM

Configuration file

I have created all of the files below in the sample folder litellm.

The heart of our setup is the config.yaml configuration file that tells LiteLLM which models to expose and how to connect to them:

model_list:
  - model_name: databricks-claude-sonnet-4
    litellm_params:
      model: databricks/databricks-claude-sonnet-4
      api_key: os.environ/DATABRICKS_API_KEY
      api_base: os.environ/DATABRICKS_API_BASE
  - model_name: databricks-claude-3-7-sonnet
    litellm_params:
      model: databricks/databricks-claude-3-7-sonnet
      api_key: os.environ/DATABRICKS_API_KEY
      api_base: os.environ/DATABRICKS_API_BASE

In my example, I am going to make use of two models from my Azure Databricks instance: databricks-claude-sonnet-4 & databricks-claude-3-7-sonnet.

The config.yaml file above maps user-friendly model names to their actual Databricks endpoints, using environment variables for the sensitive values.

Environment Setup

We store our Azure Databricks credentials in a .env file:

DATABRICKS_API_KEY=your-databricks-api-key
DATABRICKS_API_BASE=https://your-databricks-instance.azuredatabricks.net/serving-endpoints/

3. Dockerised Deployment

To keep things clean and portable, I’ve containerised the LiteLLM proxy using Docker:

# Build the image from the BerriAI/litellm repository
docker build -t litellm-local https://github.com/BerriAI/litellm.git

# Run the container  
docker run -d \
    --name litellm-container \
    -p 4000:4000 \
    -v $(pwd)/config.yaml:/app/config.yaml \
    --env-file .env \
    litellm-local \
    --config /app/config.yaml --port 4000

This script builds a Docker image from the LiteLLM GitHub repository, then runs it with our configuration, exposing it on port 4000.

4. Testing the LiteLLM Proxy

You can now interact with your Azure Databricks-hosted Claude models as if they were Anthropic endpoints:

curl -X POST http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "databricks-claude-3-7-sonnet",
        "messages": [{"role": "user", "content": "Hello, Claude!"}]
      }'

If successful, you will see a response similar to the one below:

{
  "id": "msg_bdrk_01CHLqspDTBWBawnNBE7ZQMz",
  "created": 1752091938,
  "model": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
  "object": "chat.completion",
  "system_fingerprint": null,
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Hello! It's nice to meet you. How can I help you today? I'm ready to assist with information, answer questions, or have a conversation about topics you're interested in.",
        "role": "assistant",
        "tool_calls": null,
        "function_call": null
      }
    }
  ],
  "usage": {
    "completion_tokens": 41,
    "prompt_tokens": 11,
    "total_tokens": 52,
    "completion_tokens_details": null,
    "prompt_tokens_details": null
  }
}

The beauty of this approach is that, since we haven’t configured a master key in LiteLLM, the authorization token can be anything you want for local development – here we’re using a placeholder sk-1234.
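When scripting against the proxy, the fields you usually want are the message content and the token usage. A short stdlib sketch pulling them out of a response shaped like the one above (the content string here is trimmed for brevity):

```python
import json

# A trimmed-down version of the proxy response shown above
raw = """
{"choices": [{"finish_reason": "stop", "index": 0,
              "message": {"content": "Hello! How can I help you today?", "role": "assistant"}}],
 "usage": {"completion_tokens": 41, "prompt_tokens": 11, "total_tokens": 52}}
"""

resp = json.loads(raw)
answer = resp["choices"][0]["message"]["content"]
total_tokens = resp["usage"]["total_tokens"]
print(answer)
print(f"total tokens: {total_tokens}")
```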

5. Integration with Claude Code CLI

For those using Anthropic’s powerful Claude Code CLI, you can point it at your local proxy by setting these environment variables and then running claude (ANTHROPIC_MODEL can be any of the model names you have referenced in config.yaml above):

export ANTHROPIC_BASE_URL="http://localhost:4000" 
export ANTHROPIC_API_KEY="sk-1234" 
export ANTHROPIC_MODEL="databricks-claude-3-7-sonnet"

claude

If everything is working correctly, you will see something similar to the below when you run claude:

Screenshot showing Claude environment variables highlighted

This configuration allows Claude Code to seamlessly use the Databricks-hosted Claude models through your local proxy. The Claude Code CLI is an interactive command-line tool that helps with software engineering tasks, making it perfect for developers who prefer terminal-based workflows.

Wrapping up

Setting up LiteLLM as a proxy for Azure Databricks-hosted Claude models has been a significant productivity boost for my AI development workflow. The standardised interface means less time wrestling with different APIs and more time focusing on what matters – building great AI-powered features. The ability to use Claude Code CLI with Databricks-hosted models has been particularly valuable for my terminal-centric development process.

Have you tried using LiteLLM or similar proxy tools with Claude Code in your AI development? I’d love to hear about your experiences

A mixture of Claude Code & GitHub Copilot is truly an awesome experience!

GitHub repository containing the above setup

