Embeddings turn text into numeric vectors you can store in a vector database, search with cosine similarity, or use in RAG pipelines. The vector length depends on the model (typically 384–1024 dimensions).
Recommended models
The examples below use embeddinggemma. Other popular embedding models in the Ollama library include nomic-embed-text, mxbai-embed-large, and all-minilm.
Generate embeddings
CLI

Generate embeddings directly from the command line:

ollama run embeddinggemma "Hello world"

You can also pipe text to generate embeddings:

echo "Hello world" | ollama run embeddinggemma

Output is a JSON array.

cURL

curl -X POST http://localhost:11434/api/embed \
-H "Content-Type: application/json" \
-d '{
"model": "embeddinggemma",
"input": "The quick brown fox jumps over the lazy dog."
}'
Python
import ollama
single = ollama.embed(
model='embeddinggemma',
input='The quick brown fox jumps over the lazy dog.'
)
print(len(single['embeddings'][0])) # vector length
JavaScript
import ollama from 'ollama'
const single = await ollama.embed({
model: 'embeddinggemma',
input: 'The quick brown fox jumps over the lazy dog.',
})
console.log(single.embeddings[0].length) // vector length
The /api/embed endpoint returns L2‑normalized (unit‑length) vectors.
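You can verify this yourself. A minimal check, assuming numpy is installed:

import ollama
import numpy as np

resp = ollama.embed(model='embeddinggemma', input='Hello world')
vec = np.array(resp['embeddings'][0])
# The L2 norm of a unit-length vector is 1.0 (up to floating-point error).
print(np.linalg.norm(vec))

Because the vectors are unit length, cosine similarity between two embeddings reduces to a plain dot product.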
Generate a batch of embeddings
Pass an array of strings to input.
cURL
curl -X POST http://localhost:11434/api/embed \
-H "Content-Type: application/json" \
-d '{
"model": "embeddinggemma",
"input": [
"First sentence",
"Second sentence",
"Third sentence"
]
}'
Python
import ollama
batch = ollama.embed(
model='embeddinggemma',
input=[
'The quick brown fox jumps over the lazy dog.',
'The five boxing wizards jump quickly.',
'Jackdaws love my big sphinx of quartz.',
]
)
print(len(batch['embeddings'])) # number of vectors
JavaScript
import ollama from 'ollama'
const batch = await ollama.embed({
model: 'embeddinggemma',
input: [
'The quick brown fox jumps over the lazy dog.',
'The five boxing wizards jump quickly.',
'Jackdaws love my big sphinx of quartz.',
],
})
console.log(batch.embeddings.length) // number of vectors
Tips
- Use cosine similarity for most semantic search use cases; see the sketch after this list.
- Use the same embedding model for both indexing and querying.
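Putting both tips together, here is a minimal semantic-search sketch in Python. It indexes a handful of documents with one batch call, embeds the query with the same model, and ranks documents by dot product, which equals cosine similarity for unit-length vectors. The document strings are placeholder examples.

import ollama
import numpy as np

docs = [
    'Llamas are members of the camelid family.',
    'The /api/embed endpoint returns unit-length vectors.',
    'Paris is the capital of France.',
]

# Index: embed all documents in a single batch call.
index = np.array(ollama.embed(model='embeddinggemma', input=docs)['embeddings'])

# Query: embed with the same model used for indexing.
query = 'What family do llamas belong to?'
q = np.array(ollama.embed(model='embeddinggemma', input=query)['embeddings'][0])

# Vectors are L2-normalized, so the dot product is the cosine similarity.
scores = index @ q
for score, doc in sorted(zip(scores, docs), reverse=True):
    print(f'{score:.3f}  {doc}')

For larger corpora, store the vectors in a vector database instead of an in-memory array; the ranking logic stays the same.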