POST /api/generate
curl http://localhost:11434/api/generate -d '{
"model": "gemma3",
"prompt": "Why is the sky blue?"
}'
{
  "model": "gemma3",
  "created_at": "2025-10-17T23:14:07.414671Z",
  "response": "Hello! How can I help you today?",
  "done": true,
  "done_reason": "stop",
  "total_duration": 174560334,
  "load_duration": 101397084,
  "prompt_eval_count": 11,
  "prompt_eval_duration": 13074791,
  "eval_count": 18,
  "eval_duration": 52479709
}
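The duration fields in the response above are all reported in nanoseconds, so throughput can be derived directly from them. A minimal sketch using only the standard library, parsing the example response shown above:

```python
import json

# Parse the example /api/generate response and derive throughput.
# All *_duration fields are in nanoseconds.
resp = json.loads("""{
  "model": "gemma3",
  "created_at": "2025-10-17T23:14:07.414671Z",
  "response": "Hello! How can I help you today?",
  "done": true,
  "done_reason": "stop",
  "total_duration": 174560334,
  "load_duration": 101397084,
  "prompt_eval_count": 11,
  "prompt_eval_duration": 13074791,
  "eval_count": 18,
  "eval_duration": 52479709
}""")

NS_PER_S = 1e9
prompt_tps = resp["prompt_eval_count"] / (resp["prompt_eval_duration"] / NS_PER_S)
gen_tps = resp["eval_count"] / (resp["eval_duration"] / NS_PER_S)
print(f"prompt eval: {prompt_tps:.0f} tok/s, generation: {gen_tps:.0f} tok/s")
```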

Body

application/json
model
string
required

Model name

prompt
string

Text for the model to generate a response from

suffix
string

Text that appears after the model's response; used with fill-in-the-middle models, where the model generates the text between prompt and suffix
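A fill-in-the-middle request can be sketched as follows; the model name is illustrative and assumes a model trained for infilling:

```python
import json

# Hypothetical fill-in-the-middle request: the model generates the code
# that belongs between `prompt` (the prefix) and `suffix`.
payload = {
    "model": "codellama:code",   # illustrative; requires infill support
    "prompt": "def fib(n):\n    ",
    "suffix": "\n    return result",
    "stream": False,
}
body = json.dumps(payload)
```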

images
string[]

Base64-encoded images for models that support image input
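Images are sent as base64 strings rather than raw bytes. A minimal encoding helper (the file name in the comment is a placeholder):

```python
import base64

# Encode raw image bytes for the `images` array. The request body would
# then look like (path and prompt are illustrative):
#   {"model": "gemma3", "prompt": "What is in this picture?",
#    "images": [encode_image(open("photo.png", "rb").read())]}
def encode_image(data: bytes) -> str:
    return base64.b64encode(data).decode("ascii")
```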

format

Format to constrain the model's output: either the string "json" or a JSON Schema object.
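A request constraining the output with a JSON Schema might look like this; the schema itself is an illustrative example, not part of the API:

```python
import json

# Constrain the response shape by passing a JSON Schema as `format`.
schema = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "confidence": {"type": "number"},
    },
    "required": ["answer"],
}
payload = {
    "model": "gemma3",
    "prompt": "Why is the sky blue? Respond in JSON.",
    "format": schema,   # or simply "json" for unconstrained JSON mode
    "stream": False,
}
body = json.dumps(payload)
```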

system
string

System prompt to use when generating the response

stream
boolean
default:true

When true, returns a stream of partial responses
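Streamed responses arrive as newline-delimited JSON: one object per chunk, with "done": false until the final object. A sketch of client-side assembly, using sample lines that mimic that shape:

```python
import json

# Sample stream chunks (illustrative, mimicking the streaming format:
# one JSON object per line, final object has "done": true).
stream_lines = [
    '{"model":"gemma3","response":"The","done":false}',
    '{"model":"gemma3","response":" sky","done":false}',
    '{"model":"gemma3","response":" is blue.","done":true,"done_reason":"stop"}',
]

text = ""
for line in stream_lines:
    chunk = json.loads(line)
    text += chunk.get("response", "")
    if chunk["done"]:
        break
print(text)
```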

think
boolean

When true, returns separate thinking output in addition to content

raw
boolean

When true, returns the raw response from the model without any prompt templating

keep_alive

Model keep-alive duration (for example 5m or 0 to unload immediately)

options
object

Runtime options that control text generation
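A sketch of a request with common runtime options; the option names follow Ollama's Modelfile parameters (temperature, top_p, seed, num_predict, stop), and the values are illustrative:

```python
import json

# Request with runtime generation options set per call.
payload = {
    "model": "gemma3",
    "prompt": "Why is the sky blue?",
    "stream": False,
    "options": {
        "temperature": 0.2,   # lower = more deterministic
        "top_p": 0.9,
        "seed": 42,           # reproducible sampling
        "num_predict": 128,   # cap on generated tokens
        "stop": ["\n\n"],     # stop sequences
    },
}
body = json.dumps(payload)
```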

Response

Fields of the generation response. When streaming, the final object carries the timing and token-count fields.

model
string

Model name

created_at
string

ISO 8601 timestamp of response creation

response
string

The model's generated text response

thinking
string

The model's generated thinking output

done
boolean

Indicates whether generation has finished

done_reason
string

Reason the generation stopped

total_duration
integer

Time spent generating the response in nanoseconds

load_duration
integer

Time spent loading the model in nanoseconds

prompt_eval_count
integer

Number of input tokens in the prompt

prompt_eval_duration
integer

Time spent evaluating the prompt in nanoseconds

eval_count
integer

Number of output tokens generated in the response

eval_duration
integer

Time spent generating tokens in nanoseconds