Readers convert raw content into Document objects that can be chunked, embedded, and stored in your knowledge base. Each reader handles a specific format (PDF, CSV, Markdown, etc.) and extracts text and metadata.
How Readers Work
- Parse: Read the raw content using format-specific logic
- Extract: Pull out text and metadata (page numbers, authors, etc.)
- Chunk: Split large content into smaller pieces (if enabled)
- Return: Provide a list of Document objects ready for embedding
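The four steps above can be sketched in plain Python. This is a minimal illustration, not Agno's implementation: the Document dataclass, the read_text function, and the fixed-size character chunking are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a knowledge-base document (not Agno's class)."""
    content: str
    meta_data: dict = field(default_factory=dict)


def read_text(raw: bytes, name: str, chunk: bool = True, chunk_size: int = 40) -> list[Document]:
    # Parse: decode the raw bytes with format-specific logic (plain UTF-8 here).
    text = raw.decode("utf-8")
    # Extract: collect whatever metadata the format offers.
    meta = {"source": name, "characters": len(text)}
    # Chunk: split large content into fixed-size pieces, if enabled.
    if chunk:
        pieces = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    else:
        pieces = [text]
    # Return: one Document per piece, each tagged with its chunk index.
    return [Document(content=p, meta_data={**meta, "chunk": i}) for i, p in enumerate(pieces)]


docs = read_text(b"a" * 100, "notes.txt")
```

With chunking enabled, the 100-character input yields three documents (40, 40, and 20 characters); with chunk=False it yields one.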
Supported Readers
| Reader | Description |
|---|---|
| PDFReader | Extract text from PDF files |
| TextReader | Plain text files |
| MarkdownReader | Markdown files |
| CSVReader | CSV files (rows become documents) |
| FieldLabeledCSVReader | CSV rows as field-labeled text |
| JSONReader | JSON files |
| PPTXReader | PowerPoint presentations |
| ArxivReader | Academic papers from arXiv |
| WikipediaReader | Wikipedia articles |
| YouTubeReader | YouTube transcripts |
| WebsiteReader | Crawl websites recursively |
| WebSearchReader | Web search results |
| FirecrawlReader | Web scraping via Firecrawl API |
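To make the difference between the two CSV rows in the table concrete, here is a rough sketch using only the standard library. The function names and the exact output formatting are assumptions for illustration; the text each Agno reader actually produces may differ.

```python
import csv
import io

SAMPLE = "name,role\nAda,engineer\nLin,designer\n"


def rows_as_documents(csv_text: str) -> list[str]:
    # One document per data row, values joined in column order.
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    return [", ".join(row) for row in reader]


def rows_as_field_labeled_text(csv_text: str) -> list[str]:
    # One document per row, each value prefixed with its column name.
    reader = csv.DictReader(io.StringIO(csv_text))
    return ["\n".join(f"{key}: {value}" for key, value in row.items()) for row in reader]


plain_rows = rows_as_documents(SAMPLE)
labeled_rows = rows_as_field_labeled_text(SAMPLE)
```

Field-labeled text keeps the column name next to each value, which can help when the embedded rows need to be understood without the header.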
Using Readers with Knowledge
Pass a reader to knowledge.insert() to override automatic format detection.
Auto-Selection
Agno selects the right reader based on the file extension or URL. When you call knowledge.insert() without passing a reader, this selection happens automatically.
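Selection by extension or URL can be pictured as a lookup. The sketch below is hypothetical: the reader names come from the table above, but the extension and host mappings, the select_reader function, and the fallback choices are assumptions, not Agno's actual rules.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed mappings for illustration only.
READER_BY_EXTENSION = {
    ".pdf": "PDFReader",
    ".txt": "TextReader",
    ".md": "MarkdownReader",
    ".csv": "CSVReader",
    ".json": "JSONReader",
    ".pptx": "PPTXReader",
}
READER_BY_HOST = {
    "arxiv.org": "ArxivReader",
    "wikipedia.org": "WikipediaReader",
    "youtube.com": "YouTubeReader",
}


def select_reader(source: str) -> str:
    parsed = urlparse(source)
    is_url = parsed.scheme in ("http", "https")
    if is_url:
        # Known hosts take priority over the path's extension.
        host = parsed.hostname or ""
        for domain, reader in READER_BY_HOST.items():
            if host == domain or host.endswith("." + domain):
                return reader
    path = parsed.path if is_url else source
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in READER_BY_EXTENSION:
        return READER_BY_EXTENSION[suffix]
    # Assumed fallbacks: crawl unknown URLs, treat unknown files as text.
    return "WebsiteReader" if is_url else "TextReader"
```

For example, select_reader("report.PDF") returns "PDFReader", and a URL on a Wikipedia subdomain returns "WikipediaReader".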
Configuration
Chunking
Format-Specific Options
Runtime Options
Override settings when calling read().
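The general pattern of per-call overrides looks roughly like the sketch below. SketchReader and its chunk and chunk_size parameters are hypothetical; this page does not list which settings Agno's read() accepts, so check the reader's reference before relying on any specific argument.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchReader:
    """Hypothetical reader used only to show the override pattern."""
    chunk: bool = True
    chunk_size: int = 100

    def read(self, text: str, chunk: Optional[bool] = None,
             chunk_size: Optional[int] = None) -> list[str]:
        # Per-call arguments win; otherwise fall back to the instance settings.
        use_chunk = self.chunk if chunk is None else chunk
        size = self.chunk_size if chunk_size is None else chunk_size
        if size <= 0:
            raise ValueError("chunk_size must be positive")
        if not use_chunk:
            return [text]
        return [text[i:i + size] for i in range(0, len(text), size)]


reader = SketchReader(chunk_size=100)
text = "x" * 250
default_chunks = reader.read(text)                # uses the configured size
small_chunks = reader.read(text, chunk_size=50)   # overrides for this call only
whole = reader.read(text, chunk=False)            # disables chunking for this call
```

The instance configuration is left untouched by the overrides, so later calls without arguments behave as originally configured.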