Embeddings turn text into numeric vectors you can store in a vector database, search with cosine similarity, or use in RAG pipelines. The vector length depends on the model (typically 384–1024 dimensions).

Generate embeddings

Send a POST request to /api/embed with a single string in the input field.
curl -X POST http://localhost:11434/api/embed \
  -H "Content-Type: application/json" \
  -d '{
    "model": "embeddinggemma",
    "input": "The quick brown fox jumps over the lazy dog."
  }'
The /api/embed endpoint returns L2‑normalized (unit‑length) vectors.
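The same request can be sketched in Python using only the standard library. This is a minimal sketch rather than an official client: the helper names (`build_embed_request`, `embed`, `l2_norm`) are made up for illustration, and the code assumes the response JSON carries the vectors under an `embeddings` key, one list of floats per input. Check the API reference for the exact response shape before relying on it.

```python
import json
import math
import urllib.request

# Same endpoint as the cURL example above.
OLLAMA_EMBED_URL = "http://localhost:11434/api/embed"


def build_embed_request(model, text):
    """Build the POST request for /api/embed without sending it."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_EMBED_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(model, text):
    """Send the request and return the list of embedding vectors.

    Assumption: the response has an "embeddings" key holding one
    vector per input. Requires a server listening on localhost:11434.
    """
    with urllib.request.urlopen(build_embed_request(model, text)) as resp:
        payload = json.load(resp)
    return payload["embeddings"]


def l2_norm(vector):
    """Euclidean length of a vector; about 1.0 for a unit-length vector."""
    return math.sqrt(sum(x * x for x in vector))


# Example usage (needs a running server with the model available):
#   vectors = embed("embeddinggemma", "The quick brown fox jumps over the lazy dog.")
#   print(len(vectors[0]), l2_norm(vectors[0]))
```

Because the endpoint returns L2-normalized vectors, `l2_norm` on a returned vector should come out very close to 1.0, which makes it a quick sanity check.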

Generate a batch of embeddings

To embed several texts in one request, pass an array of strings as input.
curl -X POST http://localhost:11434/api/embed \
  -H "Content-Type: application/json" \
  -d '{
    "model": "embeddinggemma",
    "input": [
      "First sentence",
      "Second sentence",
      "Third sentence"
    ]
  }'
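A batch call looks much the same from Python. The sketch below uses hypothetical helper names (`embed_batch`, `pair_inputs_with_vectors`) and makes two assumptions that are not stated above: that the response holds the vectors under an `embeddings` key, and that they come back in the same order as the inputs. The length check guards against the second assumption silently failing.

```python
import json
import urllib.request

# Same endpoint as the cURL example above.
OLLAMA_EMBED_URL = "http://localhost:11434/api/embed"


def embed_batch(model, texts):
    """POST a list of strings to /api/embed and return the vectors.

    Assumption: the response has an "embeddings" key with one vector
    per input. Requires a server listening on localhost:11434.
    """
    body = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_EMBED_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)["embeddings"]


def pair_inputs_with_vectors(texts, vectors):
    """Pair each input text with its vector, assuming matching order."""
    if len(texts) != len(vectors):
        raise ValueError(
            f"expected {len(texts)} vectors, got {len(vectors)}"
        )
    return list(zip(texts, vectors))


# Example usage (needs a running server with the model available):
#   sentences = ["First sentence", "Second sentence", "Third sentence"]
#   pairs = pair_inputs_with_vectors(sentences, embed_batch("embeddinggemma", sentences))
```

Keeping each text next to its vector is convenient when writing the results to a vector database.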

Tips

  • Use cosine similarity for most semantic search use cases.
  • Use the same embedding model for both indexing and querying.
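As an illustration of the first tip, cosine similarity and a simple ranking step can be sketched in plain Python. The function names here are hypothetical, and the vectors are assumed to be lists of floats produced by the same model. For unit-length vectors the cosine reduces to a dot product; the sketch divides by the norms anyway so it also works on vectors that are not normalized.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, document_vectors):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [
        (index, cosine_similarity(query_vector, vector))
        for index, vector in enumerate(document_vectors)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A length mismatch between query and document vectors usually means they were produced by different models, which is why the function raises instead of guessing.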