When an agent needs information, it searches for relevant chunks rather than loading everything into the prompt. This keeps responses focused and efficient.
```python
from agno.knowledge.knowledge import Knowledge
from agno.vectordb.pgvector import PgVector
from agno.vectordb.search import SearchType

knowledge = Knowledge(
    vector_db=PgVector(
        table_name="embeddings",
        db_url=db_url,
        search_type=SearchType.hybrid,
    ),
    max_results=5,
)

results = knowledge.search("What's our return policy?")
```
## How Search Works

1. **Query Analysis**: The agent analyzes the user's question to understand what information would help.
2. **Search Execution**: The system runs vector, keyword, or hybrid search based on configuration.
3. **Retrieval**: The knowledge base returns the most relevant content chunks.
4. **Response Generation**: Retrieved information is combined with the question to generate a response.
## Search Types

### Vector Search
Finds content by meaning, not exact words. When you ask “How do I reset my password?”, it finds documents about “changing credentials” even if those exact words don’t appear.
```python
vector_db = PgVector(
    table_name="embeddings",
    db_url=db_url,
    search_type=SearchType.vector,
)
```
Best for: Conceptual questions where users phrase things differently than your docs.
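Under the hood, vector search embeds the query and ranks stored chunks by vector similarity rather than word overlap. A minimal sketch of cosine-similarity ranking, using toy hand-made 3-dimensional vectors in place of real embeddings (which have hundreds of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy "embeddings": similar meanings get similar vectors.
chunks = {
    "reset your password": [0.9, 0.1, 0.0],
    "changing credentials": [0.8, 0.2, 0.1],   # close in meaning, different words
    "quarterly revenue report": [0.0, 0.1, 0.9],
}
query_vec = [0.85, 0.15, 0.05]  # embedding of "How do I reset my password?"

ranked = sorted(chunks, key=lambda c: cosine_similarity(query_vec, chunks[c]), reverse=True)
```

Both password-related chunks score near the top even though only one shares the query's wording, which is exactly the behavior described above.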
### Keyword Search
Classic text search that matches exact words and phrases. Uses your database’s full-text search or keyword matching capabilities.
```python
vector_db = PgVector(
    table_name="embeddings",
    db_url=db_url,
    search_type=SearchType.keyword,
)
```
Best for: Specific terms, product names, error codes, technical identifiers.
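To see why keyword search wins for identifiers, here is a deliberately simplified term-overlap scorer (an illustration only, not the actual full-text ranking your database uses):

```python
def keyword_score(query: str, document: str) -> int:
    """Count how many query terms appear verbatim in the document."""
    doc_terms = set(document.lower().split())
    return sum(term in doc_terms for term in query.lower().split())

docs = [
    "Error code E4017 indicates a failed payment capture",
    "Payments can fail for many reasons, including expired cards",
]
query = "error E4017"
best = max(docs, key=lambda d: keyword_score(query, d))
```

A vector search might rate both documents as "about payment failures"; exact matching pins the error code to the one document that actually mentions it.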
### Hybrid Search
Combines vector similarity with keyword matching. Usually the best choice for production.
```python
from agno.knowledge.reranker.cohere import CohereReranker

vector_db = PgVector(
    table_name="embeddings",
    db_url=db_url,
    search_type=SearchType.hybrid,
    reranker=CohereReranker(),  # Optional: improves result ordering
)
```
Best for: Most real-world applications where you want both semantic understanding and exact-match precision.
Start with hybrid search and add a reranker for best results.
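To give intuition for how hybrid search merges its two ranked lists, here is reciprocal rank fusion (RRF), one widely used merging strategy. This is a conceptual sketch, not necessarily the exact algorithm PgVector applies internally:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists: each list contributes 1/(k + rank) per item,
    so documents that rank well in *both* lists rise to the top."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

vector_hits = ["doc_a", "doc_b", "doc_c"]   # ranked by semantic similarity
keyword_hits = ["doc_c", "doc_a", "doc_d"]  # ranked by exact-match relevance
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

`doc_a` wins because it appears high in both lists, even though neither search put it first on its own; a reranker then reorders this fused list with a stronger relevance model.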
## Agentic vs Traditional RAG
Traditional RAG always searches with the exact user query and injects results into the prompt.
Agentic RAG lets the agent decide when to search, reformulate queries, and run follow-up searches if needed.
**Traditional RAG**

```python
# Always searches, always injects results
results = knowledge.search(user_query)
context = "\n\n".join([d.content for d in results])
response = llm.generate(user_query + "\n" + context)
```

**Agentic RAG**

```python
from agno.agent import Agent

# Agent decides when to search
agent = Agent(
    knowledge=knowledge,
    search_knowledge=True,  # Agent calls the search_knowledge_base tool when needed
)
agent.print_response("What's our return policy?")
```
With Agentic RAG, the agent can:
- Skip searching when it already knows the answer
- Reformulate queries for better results
- Run multiple searches to gather complete information
- Combine results from different searches
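The control flow behind those capabilities can be sketched as a small loop. Everything here is a stand-in written for illustration (`search` and `generate` are hypothetical callables, not Agno APIs); the real agent makes these decisions through tool calls:

```python
def agentic_answer(question: str, search, generate, max_searches: int = 3) -> str:
    """Sketch of an agentic RAG loop: search only until useful results arrive,
    reformulating the query after an empty round."""
    context: list[str] = []
    query = question
    for _ in range(max_searches):
        results = search(query)
        context.extend(results)
        if results:                       # found something useful: stop searching
            break
        query = f"rephrased: {query}"     # reformulate and try again

    return generate(question, context)

# Toy stand-ins so the sketch runs end to end
def fake_search(q: str) -> list[str]:
    return ["Returns are accepted within 30 days."] if "rephrased" in q else []

def fake_generate(q: str, ctx: list[str]) -> str:
    return ctx[0] if ctx else "I don't know."

answer = agentic_answer("What's our return policy?", fake_search, fake_generate)
```

Contrast this with the traditional pipeline above, which always performs exactly one search with the verbatim user query.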
## Filtering Results
Filter searches by metadata to target specific content:
```python
# Add content with metadata
knowledge.insert(
    path="policies/",
    metadata={"department": "hr", "type": "policy", "year": 2024},
)

# Search with filters
results = knowledge.search(
    query="vacation policy",
    filters={"department": "hr", "type": "policy"},
)

# Use filters with agents
agent.print_response(
    "What's our vacation policy?",
    knowledge_filters={"department": "hr"},
)
```
For complex filtering with OR, NOT, and comparisons, see Filtering.
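Conceptually, a plain filter dict like the one above matches with AND semantics: every key must equal the document's metadata value. The sketch below illustrates that semantics in plain Python; it is not Agno's implementation, and the richer OR/NOT/comparison syntax is documented on the Filtering page:

```python
def matches(metadata: dict, filters: dict) -> bool:
    """AND semantics: every filter key must equal the metadata value."""
    return all(metadata.get(key) == value for key, value in filters.items())

docs = [
    {"content": "PTO policy text", "metadata": {"department": "hr", "type": "policy"}},
    {"content": "Oncall runbook", "metadata": {"department": "eng", "type": "runbook"}},
]

hits = [d for d in docs if matches(d["metadata"], {"department": "hr", "type": "policy"})]
```

A document missing a filtered key simply fails the match, which is why consistent metadata (see below) matters so much for filtering.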
## Custom Retrieval Logic
Override the default search behavior with a custom retriever:
```python
async def my_retriever(query: str, num_documents: int = 5, filters: dict | None = None, **kwargs):
    # Reformulate the query before searching
    expanded_query = query.replace("vacation", "paid time off PTO")
    # Run the search against the knowledge base
    docs = await knowledge.asearch(expanded_query, max_results=num_documents, filters=filters)
    return [d.to_dict() for d in docs]

agent = Agent(
    knowledge=knowledge,
    knowledge_retriever=my_retriever,
)
```
## Improving Search Quality

### Chunk Size
How you split content affects retrieval precision:
| Chunk Size | Trade-off |
|---|---|
| Small (1000-3000 chars) | More precise, but may miss context |
| Default (5000 chars) | Balanced precision and context |
| Large (8000+ chars) | More context, but less targeted |
| Semantic chunking | Splits at natural topic boundaries |
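To make the size/overlap trade-off concrete, here is a minimal fixed-size chunker with overlap, so sentences straddling a boundary land in both neighbouring chunks. This is an illustrative sketch, not the framework's chunking implementation:

```python
def chunk_text(text: str, chunk_size: int = 5000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks that overlap by `overlap` characters."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step back by the overlap each time
    return chunks

sample = "".join(str(i % 10) for i in range(12000))
pieces = chunk_text(sample, chunk_size=5000, overlap=200)
```

The tail of each chunk repeats as the head of the next, which costs a little storage but prevents retrieval from missing content that a hard cut would have split in two.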
### Embedding Model
Your embedder converts text into vectors that capture meaning. The right choice depends on your content:
| Type | Use Case |
|---|---|
| General-purpose (OpenAI, Gemini) | Works well for most content |
| Domain-specific | Better for specialized fields like medical or legal |
| Multilingual | Required for non-English or mixed-language content |
See Embedders for available options.
### Metadata

Rich metadata enables better filtering:
```python
# Good: specific, consistent, filterable
metadata = {
    "department": "engineering",
    "document_type": "runbook",
    "service": "payments",
    "last_updated": "2024-01-15",
}

# Bad: vague, inconsistent
metadata = {"type": "doc", "id": "12345"}
```
### Content Structure
Well-organized content searches better:
- Use clear headings and sections
- Include relevant terminology naturally
- Add summaries at the top of long documents
- Use descriptive filenames (`hr_vacation_policy_2024.pdf`, not `document1.pdf`)
### Testing
Test with real queries to validate search quality:
```python
test_queries = [
    "What's our vacation policy?",
    "How do I submit expenses?",
    "Remote work guidelines",
]

for query in test_queries:
    results = knowledge.search(query)
    if results:
        print(f"{query} -> {results[0].content[:100]}...")
    else:
        print(f"{query} -> No results")
```
## Next Steps