Setting Up LiteLLM with Azure Databricks Serving Endpoints for Claude Code


As cloud AI offerings expand, we increasingly need to work with models hosted across different platforms. For example, Azure Databricks offers Claude models through their serving endpoints.

If you’ve ever wanted the flexibility of Claude Code’s terminal while using your organisation’s Azure Databricks infrastructure, you’re not alone.

This guide walks you through configuring Claude Code to work with Claude models hosted on Azure Databricks, using LiteLLM as a universal API proxy.


LiteLLM: The Universal Translator for AI APIs

LiteLLM acts as a proxy server that translates between different AI model APIs, providing a consistent interface regardless of where the model is actually hosted. Think of it as a universal adapter for your AI needs – plug in any model, interact with it through a standardised interface.
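To make this concrete, here is a minimal Python sketch (standard library only) of the OpenAI-style chat payload that LiteLLM accepts for every backend – only the model name changes, never the shape of the request. The helper name `make_chat_request` is my own illustration, not part of the LiteLLM API:

```python
import json

def make_chat_request(model: str, prompt: str) -> dict:
    # The same OpenAI-style payload works for any model LiteLLM proxies,
    # whether it is hosted on Databricks, Bedrock, or Anthropic directly.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Swapping backends is just a change of model name:
for model in ("databricks-claude-sonnet-4", "databricks-claude-3-7-sonnet"):
    print(json.dumps(make_chat_request(model, "Hello, Claude!")))
```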

The Setup: A Step-by-Step Guide

Let me walk you through how I’ve set up LiteLLM to expose Claude models hosted on Azure Databricks serving endpoints for use with Claude Code.

1. Azure Databricks setup

  • I will not go through how to create an Azure Databricks instance, as it’s very straightforward – just ensure you select the Premium SKU as part of the setup.

Log into your Azure Databricks instance (example URL: https://adb-1234.azuredatabricks.net/) and:

  • Create a developer access token
    • Go to your user profile (top right) → Settings → User → Developer section → Access Tokens → Generate new token
  • Serving endpoints URL
    • Note your endpoint, which will look like:
      https://your-databricks-instance.azuredatabricks.net/serving-endpoints
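Before moving on, you can sanity-check the token against the Databricks REST API, whose list-endpoints call lives under /api/2.0/serving-endpoints on the same host. A small stdlib sketch – the live call only runs when DATABRICKS_API_KEY is set; otherwise it just prints the URL it would hit:

```python
import json
import os
import urllib.request

# Derive the REST API URL from the serving-endpoints base URL
base = os.environ.get(
    "DATABRICKS_API_BASE",
    "https://your-databricks-instance.azuredatabricks.net/serving-endpoints",
)
host = base.split("/serving-endpoints")[0]
list_url = f"{host}/api/2.0/serving-endpoints"

token = os.environ.get("DATABRICKS_API_KEY")
if token:
    # Lists the serving endpoints visible to this token
    req = urllib.request.Request(list_url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        print(json.dumps(json.load(resp), indent=2))
else:
    print(f"Set DATABRICKS_API_KEY, then this would call: {list_url}")
```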

2. Setting up and configuring LiteLLM

Configuration file

I have created all of the files below in the sample folder litellm.

The heart of our setup is the config.yaml configuration file that tells LiteLLM which models to expose and how to connect to them:

model_list:
  - model_name: databricks-claude-sonnet-4
    litellm_params:
      model: databricks/databricks-claude-sonnet-4
      api_key: os.environ/DATABRICKS_API_KEY
      api_base: os.environ/DATABRICKS_API_BASE
  - model_name: databricks-claude-3-7-sonnet
    litellm_params:
      model: databricks/databricks-claude-3-7-sonnet
      api_key: os.environ/DATABRICKS_API_KEY
      api_base: os.environ/DATABRICKS_API_BASE

In my example, I am going to make use of two models from my Azure Databricks instance: databricks-claude-sonnet-4 & databricks-claude-3-7-sonnet.

The config.yaml file above maps user-friendly model names to their actual Databricks endpoints, using environment variables for the sensitive values.

Environment Setup

We store our Azure Databricks credentials in a .env file:

DATABRICKS_API_KEY=your-databricks-api-key
DATABRICKS_API_BASE=https://your-databricks-instance.azuredatabricks.net/serving-endpoints/

3. Dockerised Deployment

To keep things clean and portable, I’ve containerised the LiteLLM proxy using Docker:

# Build the image from the BerriAI/litellm repository
docker build -t litellm-local https://github.com/BerriAI/litellm.git

# Run the container  
docker run -d \
    --name litellm-container \
    -p 4000:4000 \
    -v $(pwd)/config.yaml:/app/config.yaml \
    --env-file .env \
    litellm-local \
    --config /app/config.yaml --port 4000

This script builds a Docker image from the LiteLLM GitHub repository, then runs it with our configuration, exposing it on port 4000.

4. Testing the LiteLLM Proxy

You can now interact with your Azure Databricks-hosted Claude models as if they were Anthropic endpoints:

curl -X POST http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "databricks-claude-3-7-sonnet",
        "messages": [{"role": "user", "content": "Hello, Claude!"}]
      }'

If successful, you will see a response similar to the one below:

{
  "id": "msg_bdrk_01CHLqspDTBWBawnNBE7ZQMz",
  "created": 1752091938,
  "model": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
  "object": "chat.completion",
  "system_fingerprint": null,
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Hello! It's nice to meet you. How can I help you today? I'm ready to assist with information, answer questions, or have a conversation about topics you're interested in.",
        "role": "assistant",
        "tool_calls": null,
        "function_call": null
      }
    }
  ],
  "usage": {
    "completion_tokens": 41,
    "prompt_tokens": 11,
    "total_tokens": 52,
    "completion_tokens_details": null,
    "prompt_tokens_details": null
  }
}

The beauty of this approach is that, since we haven’t configured a master key in LiteLLM, the authorization token can be anything you want for local development – here we’re using a placeholder sk-1234.
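When scripting against the proxy, the fields you usually want are the message content and the token usage. A short stdlib sketch pulling them out of a response shaped like the one above (the content string here is trimmed for brevity):

```python
import json

# A trimmed-down version of the proxy response shown above
raw = """
{"choices": [{"finish_reason": "stop", "index": 0,
              "message": {"content": "Hello! How can I help you today?", "role": "assistant"}}],
 "usage": {"completion_tokens": 41, "prompt_tokens": 11, "total_tokens": 52}}
"""

resp = json.loads(raw)
answer = resp["choices"][0]["message"]["content"]
total_tokens = resp["usage"]["total_tokens"]
print(answer)
print(f"total tokens: {total_tokens}")
```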

5. Integration with Claude Code CLI

For those using Anthropic’s powerful Claude Code CLI, you can point it at your local proxy by setting these environment variables and then running claude (ANTHROPIC_MODEL can be any of the model names you have referenced in config.yaml above):

export ANTHROPIC_BASE_URL="http://localhost:4000" 
export ANTHROPIC_API_KEY="sk-1234" 
export ANTHROPIC_MODEL="databricks-claude-3-7-sonnet"

claude

If everything is working correctly, you will see something similar to the below when you run claude:

Screenshot showing Claude environment variables highlighted

This configuration allows Claude Code to seamlessly use the Databricks-hosted Claude models through your local proxy. The Claude Code CLI is an interactive command-line tool that helps with software engineering tasks, making it perfect for developers who prefer terminal-based workflows.

Wrapping up

Setting up LiteLLM as a proxy for Azure Databricks-hosted Claude models has been a significant productivity boost for my AI development workflow. The standardised interface means less time wrestling with different APIs and more time focusing on what matters – building great AI-powered features. The ability to use Claude Code CLI with Databricks-hosted models has been particularly valuable for my terminal-centric development process.

Have you tried using LiteLLM or similar proxy tools with Claude Code in your AI development? I’d love to hear about your experiences

A mixture of Claude Code & GitHub Copilot is truly an awesome experience!

GitHub repository containing the above setup

