POST /api/chat
curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [
    {
      "role": "user",
      "content": "why is the sky blue?"
    }
  ]
}'
{
  "model": "gemma3",
  "created_at": "2025-10-17T23:14:07.414671Z",
  "message": {
    "role": "assistant",
    "content": "The sky looks blue because air molecules scatter short-wavelength blue light more strongly than other colors."
  },
  "done": true,
  "done_reason": "stop",
  "total_duration": 174560334,
  "load_duration": 101397084,
  "prompt_eval_count": 11,
  "prompt_eval_duration": 13074791,
  "eval_count": 18,
  "eval_duration": 52479709
}

Body (application/json)
model
string
required

Model name

messages
object[]
required

Chat history as an array of message objects (each with a role and content)
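
For multi-turn chat, earlier turns (including the assistant's replies) go back into messages. A minimal sketch, validated locally since sending it needs a running Ollama server:

```shell
# Multi-turn request body: the prior assistant reply is included for context
payload='{
  "model": "gemma3",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" },
    { "role": "assistant", "content": "Because air molecules scatter blue light the most." },
    { "role": "user", "content": "and why is the sunset red?" }
  ]
}'
# Sanity-check the JSON before sending
echo "$payload" | python3 -m json.tool > /dev/null && echo "payload ok"
# Send it to a running server:
# curl http://localhost:11434/api/chat -d "$payload"
```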

tools
object[]

Optional list of function tools the model may call during the chat
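
A tool entry follows the common function-tool shape; the get_current_weather name and its parameters below are illustrative, not part of the API:

```shell
# Hypothetical tool definition: get_current_weather (name and schema are made up
# for illustration; only the outer type/function shape matters)
payload='{
  "model": "gemma3",
  "messages": [{ "role": "user", "content": "What is the weather in Paris?" }],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": { "city": { "type": "string" } },
          "required": ["city"]
        }
      }
    }
  ]
}'
echo "$payload" | python3 -m json.tool > /dev/null && echo "payload ok"
# curl http://localhost:11434/api/chat -d "$payload"
```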

format
string | object

Format to return the response in: the string json, or a JSON schema object describing the desired structured output

Available options:
json
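
A sketch of requesting JSON output (a JSON schema object may be supplied in place of the string json for structured output):

```shell
# Ask the model to reply in JSON; stream is disabled to get one complete object
payload='{
  "model": "gemma3",
  "messages": [{ "role": "user", "content": "List three primary colors as JSON." }],
  "format": "json",
  "stream": false
}'
echo "$payload" | python3 -m json.tool > /dev/null && echo "payload ok"
# curl http://localhost:11434/api/chat -d "$payload"
```
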
options
object

Runtime options that control text generation
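
As a sketch, two commonly used options are temperature and seed; pairing a low temperature with a fixed seed makes output more repeatable:

```shell
# Runtime options: low temperature for determinism, fixed seed for repeatability
payload='{
  "model": "gemma3",
  "messages": [{ "role": "user", "content": "why is the sky blue?" }],
  "options": { "temperature": 0.2, "seed": 42 }
}'
echo "$payload" | python3 -m json.tool > /dev/null && echo "payload ok"
# curl http://localhost:11434/api/chat -d "$payload"
```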

stream
boolean
default:true

When true (the default), the response is streamed as a series of partial JSON objects; when false, a single complete response object is returned

think
boolean

When true, returns separate thinking output in addition to content
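
A sketch of enabling thinking output; think is only meaningful for thinking-capable models, so the model name here is just a placeholder:

```shell
# Request separate thinking output alongside the content
payload='{
  "model": "gemma3",
  "messages": [{ "role": "user", "content": "why is the sky blue?" }],
  "think": true
}'
echo "$payload" | python3 -m json.tool > /dev/null && echo "payload ok"
# curl http://localhost:11434/api/chat -d "$payload"
```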

keep_alive
string | number

How long the model stays loaded in memory after the request (for example 5m, or 0 to unload immediately)
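
For example, to unload the model immediately after the request:

```shell
# keep_alive of 0 frees the model's memory as soon as the response is done
payload='{
  "model": "gemma3",
  "messages": [{ "role": "user", "content": "why is the sky blue?" }],
  "keep_alive": 0
}'
echo "$payload" | python3 -m json.tool > /dev/null && echo "payload ok"
# curl http://localhost:11434/api/chat -d "$payload"
```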

Response

Chat response

model
string

Model name used to generate this message

created_at
string<date-time>

Timestamp of response creation (ISO 8601)

message
object

The generated assistant message, containing a role and content

done
boolean

Indicates whether the chat response has finished

done_reason
string

Reason the response finished (for example stop)

total_duration
integer

Total time spent generating in nanoseconds

load_duration
integer

Time spent loading the model in nanoseconds

prompt_eval_count
integer

Number of tokens in the prompt

prompt_eval_duration
integer

Time spent evaluating the prompt in nanoseconds

eval_count
integer

Number of tokens generated in the response

eval_duration
integer

Time spent generating tokens in nanoseconds
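
Since the duration fields are nanoseconds, generation speed follows directly from eval_count and eval_duration. A quick check using the example response above:

```shell
# tokens/s = eval_count / (eval_duration converted from nanoseconds to seconds)
eval_count=18
eval_duration=52479709   # nanoseconds, from the example response
tps=$(python3 -c "print(round($eval_count / ($eval_duration / 1e9), 1))")
echo "about $tps tokens/s"
```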