The AI Concepts Podcast

Module 6: RAG | Chunking - Where You Cut Decides What Gets Found

10 min · 29 Apr 2026
Description

This episode is about chunking, the quiet step in a RAG pipeline that decides whether your system retrieves the right answer or a confidently wrong one. It covers why the chunk is the real unit of retrieval, the tradeoff between context and precision, the main strategies teams use to split documents, and why testing your chunks against real questions matters more than picking the perfect size.
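As a rough illustration of the context-versus-precision tradeoff described above, here is a minimal sketch of fixed-size chunking with overlap, plus a crude check of whether expected answers survive the cut intact. The function names (`chunk_words`, `answer_coverage`), the word-based sizing, and the substring check are assumptions made for this example, not something taken from the episode; a real evaluation would run actual retrieval against real questions.

```python
def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into chunks of at most `chunk_size` words.

    Consecutive chunks share `overlap` words, so a sentence that straddles
    a boundary still appears whole in at least one chunk if it is short enough.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


def answer_coverage(chunks: list[str], answers: list[str]) -> float:
    """Fraction of expected answer strings found whole inside a single chunk.

    This is only a proxy for retrieval quality: it tells you whether the
    cut points split an answer apart, not whether search would rank it first.
    """
    if not answers:
        return 0.0
    found = sum(any(answer in chunk for chunk in chunks) for answer in answers)
    return found / len(answers)
```

Running `answer_coverage` over the same document with a few different `chunk_size` and `overlap` values is one cheap way to see how the cut points affect what can be found at all, before any embedding model is involved.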

All episodes

73 episodes

Module 6: RAG | GraphRAG - When Relationships Matter More Than Text

This episode addresses the category of questions that vector search fundamentally cannot answer, questions about relationships between things. We explore what a knowledge graph is and why traversing connections between entities requires a completely different data structure than semantic similarity search. We break down Microsoft's GraphRAG approach, how it extracts entities and relationships from documents during indexing, uses community detection to identify clusters of related knowledge, and generates summaries that enable global queries across an entire corpus rather than just local document retrieval. We cover the cost improvements brought by LazyGraphRAG, the hybrid vector-plus-graph pattern most production teams are moving toward, Neo4j as the go-to graph database, and a lighter-weight entity extraction approach for teams not ready for a full knowledge graph. By the end you will understand when relationships matter more than text and how to build systems that can answer both kinds of questions.

Yesterday · 8 min
Module 6: RAG | Query Transformation - When the Question Is the Bottleneck

This episode addresses a retrieval failure that has nothing to do with your index and everything to do with the query itself. We explore the vocabulary gap between how people ask questions and how documents are written, and why even strong embedding models cannot always bridge it. We break down three techniques that fix the query before the search runs: query rewriting to reformulate casual language into formal search terms, HyDE which generates a hypothetical answer and uses that as the search query instead of the question, and multi-query expansion which generates multiple phrasings to cast a wider retrieval net. We also cover step-back prompting for queries that need broader conceptual grounding before searching. By the end you will understand why the question itself is often the highest-leverage thing to improve in a retrieval pipeline.

Yesterday · 7 min
Module 6: RAG | Parent-Child Indexing - Search Small, Retrieve Big

This episode addresses the fundamental tension between retrieval precision and generation context. We explore why small chunks produce tight embeddings that retrieve well but leave the model without enough surrounding information, and why large chunks give the model context but dilute the embedding and hurt search quality. We break down parent-child indexing as the solution that decouples these two problems entirely, how child chunks handle the search and parent chunks handle the generation, and how to structure the hierarchy for documents of different complexity. We cover practical implementations in LlamaIndex and LangChain and close with guidance on when this pattern earns its place in a pipeline. By the end you will understand how to stop choosing between finding the right thing and giving the model enough to work with.
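The "search small, retrieve big" idea can be sketched in a few lines. The example below is a toy under stated assumptions: children are sentences, parents are whole paragraphs, and word-overlap (Jaccard) scoring stands in for embedding similarity. The helper names (`build_index`, `retrieve_parent`) are hypothetical and are not the LlamaIndex or LangChain APIs mentioned in the episode.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase word set, used here as a stand-in for an embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(parents: list[str]) -> list[tuple[str, int]]:
    """Split each parent into sentence-level children.

    Each child keeps the index of the parent it came from.
    """
    children = []
    for parent_id, parent in enumerate(parents):
        for sentence in re.split(r"(?<=[.!?])\s+", parent.strip()):
            if sentence:
                children.append((sentence, parent_id))
    return children


def retrieve_parent(query: str, parents: list[str],
                    children: list[tuple[str, int]]) -> str:
    """Score the query against the small children, return the big parent."""
    if not children:
        raise ValueError("index is empty")
    query_tokens = tokenize(query)

    def score(child: tuple[str, int]) -> float:
        child_tokens = tokenize(child[0])
        union = query_tokens | child_tokens
        return len(query_tokens & child_tokens) / len(union) if union else 0.0

    _, parent_id = max(children, key=score)
    return parents[parent_id]
```

The search step only ever sees the short children, which keeps the match tight, while the caller receives the full parent, which gives a generator the surrounding context.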

Yesterday · 7 min
Module 6: RAG | Reranking - The Second Stage That Gets Retrieval Right

This episode addresses the gap between finding candidate chunks and finding the right ones. We explore the bi-encoder bottleneck, why compressing text into a single vector for comparison loses critical nuance, and how cross-encoders fix this by reading the query and document together in a single forward pass. We introduce ColBERT as a powerful middle ground between speed and accuracy through token-level late interaction, walk through the production tooling landscape including Cohere Rerank, BGE models, and RAGatouille, and close by stitching hybrid search and reranking into a complete three-stage retrieval funnel. By the end you will understand why two-stage retrieval is now the standard architecture for any serious RAG pipeline.

Yesterday · 9 min