Ollama provides compatibility with the Anthropic Messages API, so existing applications and tools like Claude Code can connect to Ollama. For coding use cases, models like glm-4.7:cloud, minimax-m2.1:cloud, and qwen3-coder are recommended. Pull a model before use:
ollama pull qwen3-coder
ollama pull glm-4.7:cloud

Usage

Environment variables

To use Ollama with tools that expect the Anthropic API (like Claude Code), set these environment variables:
export ANTHROPIC_BASE_URL=http://localhost:11434
export ANTHROPIC_API_KEY=ollama  # required but ignored

Simple /v1/messages example

basic.py
import anthropic

client = anthropic.Anthropic(
    base_url='http://localhost:11434',
    api_key='ollama',  # required but ignored
)

message = client.messages.create(
    model='qwen3-coder',
    max_tokens=1024,
    messages=[
        {'role': 'user', 'content': 'Hello, how are you?'}
    ]
)
print(message.content[0].text)

Streaming example

streaming.py
import anthropic

client = anthropic.Anthropic(
    base_url='http://localhost:11434',
    api_key='ollama',
)

with client.messages.stream(
    model='qwen3-coder',
    max_tokens=1024,
    messages=[{'role': 'user', 'content': 'Count from 1 to 10'}]
) as stream:
    for text in stream.text_stream:
        print(text, end='', flush=True)

Tool calling example

tools.py
import anthropic

client = anthropic.Anthropic(
    base_url='http://localhost:11434',
    api_key='ollama',
)

message = client.messages.create(
    model='qwen3-coder',
    max_tokens=1024,
    tools=[
        {
            'name': 'get_weather',
            'description': 'Get the current weather in a location',
            'input_schema': {
                'type': 'object',
                'properties': {
                    'location': {
                        'type': 'string',
                        'description': 'The city and state, e.g. San Francisco, CA'
                    }
                },
                'required': ['location']
            }
        }
    ],
    messages=[{'role': 'user', 'content': "What's the weather in San Francisco?"}]
)

for block in message.content:
    if block.type == 'tool_use':
        print(f'Tool: {block.name}')
        print(f'Input: {block.input}')
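
Tool results can be sent back to the model in a follow-up request. The sketch below shows that round trip with the same tool definition; the hard-coded weather string is a stand-in for a real lookup.

tool_results.py
import anthropic

client = anthropic.Anthropic(
    base_url='http://localhost:11434',
    api_key='ollama',
)

tools = [
    {
        'name': 'get_weather',
        'description': 'Get the current weather in a location',
        'input_schema': {
            'type': 'object',
            'properties': {
                'location': {'type': 'string'}
            },
            'required': ['location']
        }
    }
]

messages = [{'role': 'user', 'content': "What's the weather in San Francisco?"}]

response = client.messages.create(
    model='qwen3-coder',
    max_tokens=1024,
    tools=tools,
    messages=messages,
)

# Answer each tool_use block with a tool_result block keyed by its id.
results = []
for block in response.content:
    if block.type == 'tool_use':
        # A real application would call its own weather service here.
        results.append({
            'type': 'tool_result',
            'tool_use_id': block.id,
            'content': '68°F and sunny',
        })

messages.append({'role': 'assistant', 'content': response.content})
messages.append({'role': 'user', 'content': results})

final = client.messages.create(
    model='qwen3-coder',
    max_tokens=1024,
    tools=tools,
    messages=messages,
)
print(final.content[0].text)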

Using with Claude Code

Claude Code can be configured to use Ollama as its backend:
ANTHROPIC_BASE_URL=http://localhost:11434 ANTHROPIC_API_KEY=ollama claude --model qwen3-coder
Or set the environment variables in your shell profile:
export ANTHROPIC_BASE_URL=http://localhost:11434
export ANTHROPIC_API_KEY=ollama
Then run Claude Code with any Ollama model:
# Local models
claude --model qwen3-coder
claude --model gpt-oss:20b

# Cloud models
claude --model glm-4.7:cloud
claude --model minimax-m2.1:cloud

Endpoints

/v1/messages

Supported features

  • Messages
  • Streaming
  • System prompts
  • Multi-turn conversations
  • Vision (images) (see the example after this list)
  • Tools (function calling)
  • Tool results
  • Thinking/extended thinking
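
As an illustration of the vision and system prompt support listed above, the following sketch sends a base64-encoded image. The model name gemma3 is an assumption (any vision-capable Ollama model that has been pulled will do), as is the local file photo.jpg.

vision.py
import base64

import anthropic

client = anthropic.Anthropic(
    base_url='http://localhost:11434',
    api_key='ollama',
)

# Images are sent as base64-encoded content blocks; URL images are not supported.
with open('photo.jpg', 'rb') as f:
    image_data = base64.standard_b64encode(f.read()).decode('utf-8')

message = client.messages.create(
    model='gemma3',  # assumption: any vision-capable Ollama model
    max_tokens=1024,
    system='You are a concise assistant.',
    messages=[
        {
            'role': 'user',
            'content': [
                {
                    'type': 'image',
                    'source': {
                        'type': 'base64',
                        'media_type': 'image/jpeg',
                        'data': image_data,
                    },
                },
                {'type': 'text', 'text': 'Describe this image in one sentence.'},
            ],
        }
    ],
)
print(message.content[0].text)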

Supported request fields

  • model
  • max_tokens
  • messages
    • Text content
    • Image content (base64)
    • Array of content blocks
    • tool_use blocks
    • tool_result blocks
    • thinking blocks
  • system (string or array)
  • stream
  • temperature
  • top_p
  • top_k
  • stop_sequences
  • tools
  • thinking

Supported response fields

  • id
  • type
  • role
  • model
  • content (text, tool_use, thinking blocks)
  • stop_reason (end_turn, max_tokens, tool_use)
  • usage (input_tokens, output_tokens)
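
A quick sketch of reading these fields off a response:

response_fields.py
import anthropic

client = anthropic.Anthropic(
    base_url='http://localhost:11434',
    api_key='ollama',
)

message = client.messages.create(
    model='qwen3-coder',
    max_tokens=1024,
    messages=[{'role': 'user', 'content': 'Hello!'}],
)

# The response mirrors the Anthropic shape.
print(message.id, message.type, message.role, message.model)
print(message.stop_reason)                                      # e.g. 'end_turn'
print(message.usage.input_tokens, message.usage.output_tokens)  # approximate counts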

Streaming events

  • message_start
  • content_block_start
  • content_block_delta (text_delta, input_json_delta, thinking_delta)
  • content_block_stop
  • message_delta
  • message_stop
  • ping
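
These events can be observed by iterating the SDK stream directly instead of using text_stream; a minimal sketch:

events.py
import anthropic

client = anthropic.Anthropic(
    base_url='http://localhost:11434',
    api_key='ollama',
)

with client.messages.stream(
    model='qwen3-coder',
    max_tokens=1024,
    messages=[{'role': 'user', 'content': 'Write a haiku about the sea'}],
) as stream:
    # Iterating yields the raw events listed above, plus SDK conveniences
    # such as synthesized 'text' events.
    for event in stream:
        if event.type == 'content_block_delta' and event.delta.type == 'text_delta':
            print(event.delta.text, end='', flush=True)
        elif event.type == 'message_stop':
            print()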

Models

Ollama supports both local and cloud models.

Local models

Pull a local model before use:
ollama pull qwen3-coder
Recommended local models:
  • qwen3-coder - Excellent for coding tasks
  • gpt-oss:20b - Strong general-purpose model

Cloud models

Cloud models are available immediately without pulling:
  • glm-4.7:cloud - High-performance cloud model
  • minimax-m2.1:cloud - Fast cloud model

Default model names

For tooling that relies on default Anthropic model names such as claude-3-5-sonnet, use ollama cp to create a copy of an existing model under the expected name:
ollama cp qwen3-coder claude-3-5-sonnet
Afterwards, this new model name can be specified in the model field:
curl http://localhost:11434/v1/messages \
    -H "Content-Type: application/json" \
    -d '{
        "model": "claude-3-5-sonnet",
        "max_tokens": 1024,
        "messages": [
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'

Differences from the Anthropic API

Behavior differences

  • API key is accepted but not validated
  • anthropic-version header is accepted but not used
  • Token counts are approximations based on the underlying model’s tokenizer

Not supported

The following Anthropic API features are not currently supported:
  • /v1/messages/count_tokens - Token counting endpoint
  • tool_choice - Forcing specific tool use or disabling tools
  • metadata - Request metadata (user_id)
  • Prompt caching - cache_control blocks for caching prefixes
  • Batches API - /v1/messages/batches for async batch processing
  • Citations - citations content blocks
  • PDF support - document content blocks with PDF files
  • Server-sent errors - error events during streaming (errors are returned via HTTP status instead)

Partial support

  • Image content - Base64 images supported; URL images not supported
  • Extended thinking - Basic support; budget_tokens accepted but not enforced
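
The thinking request field follows the Anthropic shape. A sketch, assuming a model that emits thinking blocks (gpt-oss:20b here, which is an assumption); as noted above, budget_tokens is accepted but not enforced:

thinking.py
import anthropic

client = anthropic.Anthropic(
    base_url='http://localhost:11434',
    api_key='ollama',
)

message = client.messages.create(
    model='gpt-oss:20b',  # assumption: a model that produces thinking blocks
    max_tokens=2048,
    thinking={'type': 'enabled', 'budget_tokens': 1024},  # accepted but not enforced
    messages=[{'role': 'user', 'content': 'What is 17 * 24?'}],
)

for block in message.content:
    if block.type == 'thinking':
        print('[thinking]', block.thinking)
    elif block.type == 'text':
        print(block.text)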