Ollama’s API responses include metrics that can be used to measure performance and model usage:
- `total_duration`: how long the full response took to generate
- `load_duration`: how long the model took to load
- `prompt_eval_count`: how many input tokens were processed
- `prompt_eval_duration`: how long it took to evaluate the prompt
- `eval_count`: how many output tokens were generated
- `eval_duration`: how long it took to generate the output tokens

All durations are reported in nanoseconds.
Example response
For endpoints that return usage metrics, the response body will include the usage fields. For example, a non-streaming call to `/api/generate` may return the following response:
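A sketch of the response shape (the field values below are illustrative, not real measurements):

```json
{
  "model": "llama3.2",
  "created_at": "2024-01-01T00:00:00Z",
  "response": "The sky is blue because of Rayleigh scattering.",
  "done": true,
  "total_duration": 5043500667,
  "load_duration": 5025959,
  "prompt_eval_count": 26,
  "prompt_eval_duration": 325953000,
  "eval_count": 282,
  "eval_duration": 4535599000
}
```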
For streaming requests, the usage metrics appear only in the final message, where `done` is true.
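Since durations are in nanoseconds, throughput in tokens per second can be computed as `eval_count / eval_duration * 1e9`. A minimal sketch, using illustrative sample values rather than a live API call:

```python
# Derive output-token throughput from Ollama's usage metrics.
# Durations are reported in nanoseconds, so scale by 1e9.

NS_PER_SECOND = 1_000_000_000

def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Tokens generated divided by generation time, in tokens/second."""
    return eval_count / eval_duration_ns * NS_PER_SECOND

# Sample metrics shaped like a /api/generate response (values are illustrative).
response = {
    "done": True,
    "prompt_eval_count": 26,
    "prompt_eval_duration": 342_000_000,   # 0.342 s spent on the prompt
    "eval_count": 282,
    "eval_duration": 4_535_000_000,        # 4.535 s spent generating output
}

# Metrics are only present on the final (done) message when streaming.
if response["done"]:
    tps = tokens_per_second(response["eval_count"], response["eval_duration"])
    print(f"{tps:.1f} tokens/s")
```

The same formula applies to the prompt side: `prompt_eval_count / prompt_eval_duration * 1e9` gives prompt-processing throughput.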
