
8. Chunking Strategies

Chunking decides what a "retrievable unit" is. Because retrieval returns whole chunks, a fact split across two chunks, or buried in a very large one, becomes hard to surface. That makes chunking one of the highest-leverage knobs in a RAG pipeline.

The strategies at a glance

Fixed |■■■■|■■■■|■■■■| every N chars — simple, cuts sentences
Recursive |■■■ |■■■■■|■■ | split on ¶ → line → sentence — structure-aware ✅
Semantic |■■■■■|■■|■■■■■| split where meaning shifts — coherent, costly
Agentic | LLM decides | human-like splits — best quality, slowest
Strategy    Best for
Fixed       quick prototypes, uniform text
Recursive   most production RAG (start here)
Semantic    when recursive plateaus and budget allows
Agentic     messy/mixed docs where quality is critical
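To make the first two rows concrete, here is a minimal sketch in plain Python. It is a simplified illustration of the idea, not the algorithm any particular library uses: `fixed_chunks`, `recursive_chunks`, and the sample text are hypothetical names and data chosen for this example, and sizes are counted in characters.

```python
def fixed_chunks(text: str, size: int) -> list[str]:
    """Fixed strategy: cut every `size` characters, ignoring structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def recursive_chunks(text: str, size: int,
                     separators=("\n\n", "\n", ". ", " ")) -> list[str]:
    """Simplified recursive strategy: split on the coarsest separator first
    and fall back to finer ones only for pieces that are still too long.
    Separators that fall on a chunk boundary are dropped in this sketch."""
    if len(text) <= size:
        return [text] if text.strip() else []
    if not separators:
        return fixed_chunks(text, size)  # nothing left to split on
    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= size:
            current = candidate  # keep packing pieces into the current chunk
            continue
        if current:
            chunks.append(current)
        if len(piece) <= size:
            current = piece
        else:
            # This piece alone is too long: retry with finer separators.
            chunks.extend(recursive_chunks(piece, size, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


sample = (
    "Cats sleep a lot.\n\n"
    "Dogs need daily walks. They also like to play fetch.\n\n"
    "Fish are quiet."
)
fixed = fixed_chunks(sample, 40)          # first chunk stops mid-sentence
recursive = recursive_chunks(sample, 40)  # chunks follow paragraphs and sentences
```

With this sample, the fixed splitter's first chunk ends in the middle of the "Dogs" sentence, while the recursive one keeps each paragraph or sentence intact and never exceeds the size limit.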

Size and overlap

A solid default: ~400–512 tokens per chunk with 10–20% overlap. Overlap means consecutive chunks share a little text so a fact sitting on a boundary still appears whole in at least one chunk.

chunk A: [........ overlap]
chunk B: [overlap ........]
└─ shared so boundary facts survive
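The sliding window behind that diagram can be sketched in a few lines. This is an illustrative helper, not a library function: `chunk_with_overlap` is a hypothetical name, and whitespace-separated words stand in for real tokenizer tokens.

```python
def chunk_with_overlap(tokens: list[str], size: int, overlap: int) -> list[list[str]]:
    """Slide a window of `size` tokens, advancing by `size - overlap` each step."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


def contains(chunks: list[list[str]], phrase: str) -> bool:
    """True if some single chunk contains the whole phrase."""
    return any(phrase in " ".join(chunk) for chunk in chunks)


words = "the launch date is 12 March and the venue is Berlin".split()

no_overlap = chunk_with_overlap(words, size=5, overlap=0)
with_overlap = chunk_with_overlap(words, size=5, overlap=2)

# "12 March" straddles the first boundary, so only the overlapping
# version keeps it whole inside one chunk.
split_fact_survives = contains(no_overlap, "12 March")
overlap_fact_survives = contains(with_overlap, "12 March")
```

Here the overlap is 2 of 5 words, far larger than the 10–20% suggested above, purely so the effect is visible on an eleven-word input.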

Code — the default

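from langchain_text_splitters import RecursiveCharacterTextSplitter

# `text` is the raw document string you want to split.
# Note: chunk_size is measured in characters by default, not tokens.
# For token-based sizes, see RecursiveCharacterTextSplitter.from_tiktoken_encoder.
splitter = RecursiveCharacterTextSplitter(
    chunk_size=512,
    chunk_overlap=64,  # 12.5% of chunk_size
    separators=["\n\n", "\n", ". ", " ", ""],  # coarse → fine
)
chunks = splitter.create_documents([text])  # list of Document objects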

Chunk by meaning, keep a small overlap, attach metadata. Tune size against a real eval set — not by eyeballing.
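One way to tune against an eval set rather than by eye is to sweep chunk sizes and measure how often retrieval surfaces a chunk containing the expected answer. The sketch below assumes a lot: the corpus and questions are made-up illustrative data, `chunk_words`, `retrieve`, and `hit_rate` are hypothetical helpers, sizes are counted in whitespace-separated words, and a crude word-overlap scorer stands in for a real embedding retriever.

```python
def chunk_words(text: str, size: int, overlap: int) -> list[str]:
    """Split into windows of `size` words, sharing `overlap` words between neighbours."""
    words = text.split()
    step = max(1, size - overlap)
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]


def retrieve(query: str, chunks: list[str], k: int) -> list[str]:
    """Toy retriever: rank chunks by how many query words they share."""
    q = set(query.lower().split())
    ranked = sorted(chunks, key=lambda c: len(q & set(c.lower().split())), reverse=True)
    return ranked[:k]


def hit_rate(text: str, eval_set: list[tuple[str, str]],
             size: int, overlap: int, k: int = 1) -> float:
    """Fraction of questions whose expected answer appears in a top-k chunk."""
    chunks = chunk_words(text, size, overlap)
    hits = sum(
        any(answer in chunk for chunk in retrieve(question, chunks, k))
        for question, answer in eval_set
    )
    return hits / len(eval_set)


# Made-up corpus and (question, expected answer substring) pairs.
corpus = (
    "The refund window is 30 days from delivery. "
    "Refunds go back to the original payment method. "
    "Support is available by email on weekdays. "
    "Shipping to Canada takes 5 to 7 business days."
)
eval_set = [
    ("how long is the refund window", "30 days"),
    ("how long does shipping to Canada take", "5 to 7 business days"),
]

# Sweep a few sizes with roughly 20% overlap and compare the scores.
results = {size: hit_rate(corpus, eval_set, size, overlap=size // 5)
           for size in (4, 8, 16, 1000)}
for size, rate in results.items():
    print(f"size={size:>4} words  hit_rate={rate:.2f}")
```

In a real pipeline the same loop would wrap your actual splitter and retriever, and the eval set would come from real user questions with known answers.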

The next three parts go deeper into splitting techniques.
