POST /api/chat
curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [
    {
      "role": "user",
      "content": "why is the sky blue?"
    }
  ]
}'

Example response body:

{
  "model": "<string>",
  "created_at": "2023-11-07T05:31:56Z",
  "message": {
    "role": "assistant",
    "content": "<string>",
    "thinking": "<string>",
    "tool_calls": [
      {
        "function": {
          "name": "<string>",
          "description": "<string>",
          "arguments": {}
        }
      }
    ],
    "images": [
      "<string>"
    ]
  },
  "done": true,
  "done_reason": "<string>",
  "total_duration": 123,
  "load_duration": 123,
  "prompt_eval_count": 123,
  "prompt_eval_duration": 123,
  "eval_count": 123,
  "eval_duration": 123,
  "logprobs": [
    {
      "token": "<string>",
      "logprob": 123,
      "bytes": [
        123
      ],
      "top_logprobs": [
        {
          "token": "<string>",
          "logprob": 123,
          "bytes": [
            123
          ]
        }
      ]
    }
  ]
}

Body

application/json
model
string
required

Model name

messages
object[]
required

Chat history as an array of message objects, each with a role (system, user, assistant, or tool) and content
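
For a multi-turn chat, send the full history in order. A minimal sketch (the follow-up content is illustrative):

curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [
    { "role": "system", "content": "You are a concise assistant." },
    { "role": "user", "content": "why is the sky blue?" },
    { "role": "assistant", "content": "Mainly Rayleigh scattering of sunlight." },
    { "role": "user", "content": "why are sunsets red, then?" }
  ]
}'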

tools
object[]

Optional list of function tools the model may call during the chat
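
For example, a request advertising one function tool might look like the sketch below; the get_weather tool and its parameters are illustrative, and the shape mirrors the tool_calls structure shown in the response example above:

curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [
    { "role": "user", "content": "What is the weather in Toronto?" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": { "type": "string", "description": "The city name" }
          },
          "required": ["city"]
        }
      }
    }
  ]
}'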

format
string | object

The format to return the response in. Can be the string "json" or a JSON schema object that constrains the response to a structure.

Available options: json
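
A minimal sketch requesting JSON output; instructing the model in the prompt to answer in JSON as well tends to improve results:

curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [
    { "role": "user", "content": "List three primary colors. Respond as JSON." }
  ],
  "format": "json",
  "stream": false
}'
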
options
object

Runtime options that control text generation
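
For example, with a few common options (temperature, seed, num_ctx); availability of individual options can vary by model:

curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ],
  "options": {
    "temperature": 0.2,
    "seed": 42,
    "num_ctx": 4096
  }
}'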

stream
boolean
default:true

If true (the default), the response is streamed as a series of JSON objects, with the final object having done set to true. If false, a single complete response object is returned.
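
To get one complete object instead of a stream:

curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ],
  "stream": false
}'
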
think
boolean | string

When true, returns separate thinking output in addition to content. Can be a boolean (true/false) or a string ("high", "medium", "low") for supported models.
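
A sketch enabling thinking output (the model name is illustrative; use a model that supports thinking):

curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ],
  "think": true,
  "stream": false
}'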

keep_alive
string | number

Model keep-alive duration: how long the model stays loaded in memory after the request (for example "5m", or 0 to unload immediately)
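
For example, to unload the model as soon as the response completes:

curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ],
  "keep_alive": 0
}'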

logprobs
boolean

Whether to return log probabilities of the output tokens

top_logprobs
integer

Number of most likely tokens to return at each token position when logprobs are enabled
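
A sketch requesting log probabilities with the top 3 alternatives at each token position:

curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ],
  "logprobs": true,
  "top_logprobs": 3,
  "stream": false
}'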

Response

Chat response

model
string

Model name used to generate this message

created_at
string<date-time>

Timestamp of response creation (ISO 8601)

message
object

The generated message: its role, content, and, when present, thinking, tool_calls, and images fields

done
boolean

Indicates whether the chat response has finished

done_reason
string

Reason the response finished

total_duration
integer

Total time spent generating in nanoseconds

load_duration
integer

Time spent loading the model in nanoseconds

prompt_eval_count
integer

Number of tokens in the prompt

prompt_eval_duration
integer

Time spent evaluating the prompt in nanoseconds

eval_count
integer

Number of tokens generated in the response

eval_duration
integer

Time spent generating tokens in nanoseconds
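
Since durations are reported in nanoseconds, generation speed in tokens per second is eval_count / eval_duration × 10^9. For example, with jq (assuming jq is installed and the request is non-streaming):

curl -s http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ],
  "stream": false
}' | jq '.eval_count / .eval_duration * 1e9'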

logprobs
object[]

Log probability information for the generated tokens when logprobs are enabled