Body
Model name
Chat history as an array of message objects (each with a role and content)
Optional list of function tools the model may call during the chat
Format to return a response in. Can be json or a JSON schema
json Runtime options that control text generation
When true, returns separate thinking output in addition to content
Model keep-alive duration (for example 5m or 0 to unload immediately)
Response
Chat response
Model name used to generate this message
Timestamp of response creation (ISO 8601)
Indicates whether the chat response has finished
Reason the response finished
Total time spent generating in nanoseconds
Time spent loading the model in nanoseconds
Number of tokens in the prompt
Time spent evaluating the prompt in nanoseconds
Number of tokens generated in the response
Time spent generating tokens in nanoseconds

