Recommended models
For coding use cases, models likeglm-4.7:cloud, minimax-m2.1:cloud, and qwen3-coder are recommended.
Pull a model before use:
Usage
Environment variables
To use Ollama with tools that expect the Anthropic API (like Claude Code), set these environment variables:Simple /v1/messages example
basic.py
Streaming example
streaming.py
Tool calling example
tools.py
Using with Claude Code
Claude Code can be configured to use Ollama as its backend:Endpoints
/v1/messages
Supported features
- Messages
- Streaming
- System prompts
- Multi-turn conversations
- Vision (images)
- Tools (function calling)
- Tool results
- Thinking/extended thinking
Supported request fields
-
model -
max_tokens -
messages- Text
content - Image
content(base64) - Array of content blocks
-
tool_useblocks -
tool_resultblocks -
thinkingblocks
- Text
-
system(string or array) -
stream -
temperature -
top_p -
top_k -
stop_sequences -
tools -
thinking -
tool_choice -
metadata
Supported response fields
-
id -
type -
role -
model -
content(text, tool_use, thinking blocks) -
stop_reason(end_turn, max_tokens, tool_use) -
usage(input_tokens, output_tokens)
Streaming events
-
message_start -
content_block_start -
content_block_delta(text_delta, input_json_delta, thinking_delta) -
content_block_stop -
message_delta -
message_stop -
ping -
error
Models
Ollama supports both local and cloud models.Local models
Pull a local model before use:qwen3-coder- Excellent for coding tasksgpt-oss:20b- Strong general-purpose model
Cloud models
Cloud models are available immediately without pulling:glm-4.7:cloud- High-performance cloud modelminimax-m2.1:cloud- Fast cloud model
Default model names
For tooling that relies on default Anthropic model names such asclaude-3-5-sonnet, use ollama cp to copy an existing model name:
model field:
Differences from the Anthropic API
Behavior differences
- API key is accepted but not validated
anthropic-versionheader is accepted but not used- Token counts are approximations based on the underlying model’s tokenizer
Not supported
The following Anthropic API features are not currently supported:| Feature | Description |
|---|---|
/v1/messages/count_tokens | Token counting endpoint |
tool_choice | Forcing specific tool use or disabling tools |
metadata | Request metadata (user_id) |
| Prompt caching | cache_control blocks for caching prefixes |
| Batches API | /v1/messages/batches for async batch processing |
| Citations | citations content blocks |
| PDF support | document content blocks with PDF files |
| Server-sent errors | error events during streaming (errors return HTTP status) |
Partial support
| Feature | Status |
|---|---|
| Image content | Base64 images supported; URL images not supported |
| Extended thinking | Basic support; budget_tokens accepted but not enforced |

