Embedders convert text into vectors (lists of numbers) that capture meaning. These vectors enable semantic search, so “How do I reset my passcode?” finds documents mentioning “change PIN” even without keyword matches.
```python
from agno.knowledge.knowledge import Knowledge
from agno.knowledge.embedder.openai import OpenAIEmbedder
from agno.vectordb.pgvector import PgVector

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"  # example Postgres URL

knowledge = Knowledge(
    vector_db=PgVector(
        table_name="docs",
        db_url=db_url,
        embedder=OpenAIEmbedder(),  # default embedder
    ),
)
```
## How It Works
- Insert: When you add content, each chunk is converted to a vector
- Store: Vectors are saved in your vector database
- Search: Queries are embedded and matched against stored vectors by similarity
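Here is the same flow in plainer terms, outside of any vector database. This is a minimal sketch: it assumes the embedder exposes a `get_embedding()` method (verify the name against your installed Agno version) and that `OPENAI_API_KEY` is set.

```python
import math

from agno.knowledge.embedder.openai import OpenAIEmbedder

embedder = OpenAIEmbedder()

# "Insert": each chunk becomes a vector.
docs = ["How to change your PIN", "Updating your billing address"]
doc_vectors = [embedder.get_embedding(d) for d in docs]

# "Search": the query is embedded the same way.
query_vector = embedder.get_embedding("How do I reset my passcode?")

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Rank stored vectors by similarity; the vector database does this at scale.
best = max(range(len(docs)), key=lambda i: cosine(query_vector, doc_vectors[i]))
print(docs[best])  # the PIN document, despite no keyword overlap
```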
Agno uses OpenAIEmbedder by default, but you can swap in any supported embedder.
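For example, switching to a local Ollama embedder is just a change to the `embedder` argument. The `agno.knowledge.embedder.ollama` module path and `OllamaEmbedder` name below are assumed by analogy with the OpenAI import; verify them against your installed version.

```python
from agno.knowledge.knowledge import Knowledge
from agno.knowledge.embedder.ollama import OllamaEmbedder  # path assumed by analogy
from agno.vectordb.pgvector import PgVector

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"  # example Postgres URL

knowledge = Knowledge(
    vector_db=PgVector(
        table_name="docs_local",
        db_url=db_url,
        embedder=OllamaEmbedder(),  # local embeddings via a running Ollama server
    ),
)
```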
## Configuration
```python
from agno.knowledge.embedder.openai import OpenAIEmbedder

embedder = OpenAIEmbedder(
    id="text-embedding-3-small",
    dimensions=1536,
)
```
## Using with Knowledge
```python
from agno.knowledge.knowledge import Knowledge
from agno.knowledge.embedder.openai import OpenAIEmbedder
from agno.vectordb.pgvector import PgVector

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"  # example Postgres URL

knowledge = Knowledge(
    vector_db=PgVector(
        table_name="docs",
        db_url=db_url,
        embedder=OpenAIEmbedder(id="text-embedding-3-small"),
    ),
)

# Content is embedded automatically on insert
knowledge.insert(path="documents/")
```
## Batch Embeddings
Process multiple texts in a single API call to reduce requests and improve performance:
```python
from agno.knowledge.embedder.openai import OpenAIEmbedder

embedder = OpenAIEmbedder(
    id="text-embedding-3-small",
    dimensions=1536,
    enable_batch=True,  # send multiple texts per API request
    batch_size=100,     # texts per request
)
```
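With `enable_batch=True`, chunks are grouped into requests of up to `batch_size` texts rather than one request per chunk. Larger batches mean fewer round trips but bigger payloads, so tune `batch_size` against your provider's request limits.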
Embedders with batch support: OpenAI, Azure OpenAI, Gemini, Cohere, Voyage AI, Mistral, Fireworks, Together, Jina, Nebius.
## Best Practices
- **Re-embed when changing models:** Vectors from different embedders aren't compatible. If you switch embedders, you must re-embed all content.
- **Test retrieval quality:** Use sample queries to verify you're finding the right chunks; adjust your chunking strategy or embedder if results are poor (see the sketch after this list).
- **Match dimensions:** Ensure your embedder's output dimensions match what your vector database expects.
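A quick smoke test for the last two practices. This is a sketch, not a fixed recipe: `get_embedding()` and `knowledge.search()` are assumed method names, so check them against the Knowledge API in your installed Agno version.

```python
from agno.knowledge.knowledge import Knowledge
from agno.knowledge.embedder.openai import OpenAIEmbedder
from agno.vectordb.pgvector import PgVector

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"  # example Postgres URL
embedder = OpenAIEmbedder(id="text-embedding-3-small", dimensions=1536)

# Match dimensions: the vector length must equal what your table expects.
assert len(embedder.get_embedding("dimension check")) == 1536

knowledge = Knowledge(
    vector_db=PgVector(table_name="docs", db_url=db_url, embedder=embedder),
)

# Test retrieval quality: run sample queries and eyeball the top chunks.
for query in ["How do I reset my passcode?", "update billing address"]:
    for result in knowledge.search(query=query, limit=3):
        print(query, "->", result)
```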
## Supported Embedders
| Embedder | Type | Cost | Notes |
|---|---|---|---|
| OpenAI | Hosted | $$ | Default, excellent quality |
| Gemini | Hosted | $$ | Multilingual, Google ecosystem |
| Cohere | Hosted | $$ | Strong retrieval performance |
| Voyage AI | Hosted | $$$ | Specialized for retrieval |
| Mistral | Hosted | $$ | European provider |
| Ollama | Local | Free | Privacy, offline |
| FastEmbed | Local | Free | Fast local embeddings |
| HuggingFace | Local/Hosted | Free/$ | Open source models |
| AWS Bedrock | Hosted | $$ | AWS ecosystem |
| Azure OpenAI | Hosted | $$ | Azure ecosystem |
| Fireworks | Hosted | $ | Fast inference |
| Together | Hosted | $ | Open source models |
| Jina | Hosted | $$ | Multilingual |
| Nebius | Hosted | $ | European provider |
## Choosing an Embedder
| Consideration | Recommendation |
|---|---|
| General use | OpenAI or Gemini |
| Privacy/offline | Ollama or FastEmbed |
| Multilingual | Gemini or Jina |
| Cost-sensitive | Local embedders (free) or Fireworks/Together ($) |
| Best retrieval quality | Voyage AI or Cohere |
Key factors:
- Hosted vs local: Local for privacy and no API costs; hosted for quality and convenience
- Latency and cost: Smaller models are cheaper and faster; larger models often retrieve better
- Language support: Ensure your embedder supports your content’s languages
- Dimension size: Match your vector database’s expected embedding dimensions
## Next Steps