11. Agentic Chunking
Agentic chunking hands the splitting decision to an LLM, which reads the document and groups it into self-contained, meaningful units the way a human editor would — often rewriting each unit into a clear, standalone "proposition."
How it works
document ─▶ LLM ─▶ "break this into self-contained idea units"
             │
             ▼
["unit 1: ...", "unit 2: ...", ...]   each coherent + standalone
Unlike semantic chunking (which only measures sentence similarity), the LLM can reorganize, merge, and reword so each chunk stands on its own without surrounding context.
Code — sketch
import json

def agentic_chunk(text):
    # `llm` is assumed to be any callable that takes a prompt string
    # and returns the model's reply as a string.
    prompt = (
        "Split the document into self-contained chunks. Each chunk should cover "
        "one idea and make sense on its own. Return a JSON list of strings.\n\n"
        f"Document:\n{text}"
    )
    return json.loads(llm(prompt))  # list of standalone chunks
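A model's reply is not guaranteed to be clean JSON, so a slightly more defensive variant of the sketch above may be useful. The version below is one possible approach under stated assumptions, not a definitive implementation: `parse_chunks` and `fake_llm` are hypothetical names, the model is passed in as a parameter rather than read from a global, and `fake_llm` is a stand-in for a real model call so the example runs offline.

```python
import json

FENCE = "`" * 3  # markdown code-fence marker


def parse_chunks(raw: str) -> list[str]:
    """Parse an LLM reply that should contain a JSON list of strings."""
    cleaned = raw.strip()
    # Assumption: some models wrap JSON in a markdown code fence; drop it if present.
    if cleaned.startswith(FENCE):
        cleaned = cleaned.strip("`").strip()
        if cleaned.startswith("json"):
            cleaned = cleaned[len("json"):]
    chunks = json.loads(cleaned)  # raises json.JSONDecodeError on malformed output
    if not isinstance(chunks, list) or not all(isinstance(c, str) for c in chunks):
        raise ValueError("expected a JSON list of strings")
    # Drop blank entries so downstream indexing never sees empty chunks.
    return [c.strip() for c in chunks if c.strip()]


def agentic_chunk(text: str, llm) -> list[str]:
    """`llm` is any callable mapping a prompt string to a reply string."""
    prompt = (
        "Split the document into self-contained chunks. Each chunk should cover "
        "one idea and make sense on its own. Return a JSON list of strings.\n\n"
        f"Document:\n{text}"
    )
    return parse_chunks(llm(prompt))


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; returns a fenced JSON reply.
    return f'{FENCE}json\n["Paris is the capital of France.", "France is in Europe."]\n{FENCE}'


chunks = agentic_chunk("France, whose capital is Paris, is in Europe.", fake_llm)
print(chunks)
```

Raising on malformed output, rather than silently returning an empty list, lets the caller decide whether to retry the model or fall back to a cheaper chunker.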
The trade-off
| | Agentic | Semantic | Recursive |
|---|---|---|---|
| Quality | highest | high | good |
| Cost | highest (LLM per doc) | medium | lowest |
| Speed | slowest | medium | fastest |
When it's worth it
Reserve agentic chunking for messy or mixed-format documents where retrieval quality is critical and the corpus is small enough that an LLM pass per document is affordable. For large or uniform corpora, recursive (or semantic) wins on cost.
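Whether an LLM pass per document is affordable comes down to simple arithmetic. The sketch below is a rough estimate only: the function name and the per-million-token prices are hypothetical placeholders, not real vendor pricing. It assumes the model emits roughly as many tokens as it reads (since it rewrites the document) and ignores the small overhead of the instruction prompt.

```python
def estimate_chunking_cost(
    num_docs: int,
    avg_tokens_per_doc: int,
    input_price_per_m: float,   # hypothetical price per 1M input tokens
    output_price_per_m: float,  # hypothetical price per 1M output tokens
    output_ratio: float = 1.0,  # assumed output tokens per input token
) -> float:
    """Rough cost of one agentic-chunking pass over a corpus."""
    input_millions = num_docs * avg_tokens_per_doc / 1_000_000
    output_millions = input_millions * output_ratio
    return input_millions * input_price_per_m + output_millions * output_price_per_m


# 10,000 documents of about 2,000 tokens each, with placeholder prices.
cost = estimate_chunking_cost(10_000, 2_000, input_price_per_m=1.0, output_price_per_m=2.0)
print(cost)  # 60.0
```

Re-running the estimate with your own corpus size and your provider's current prices gives a quick sanity check before committing to an agentic pass.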
Quality ladder: recursive → semantic → agentic. Climb only as far as your eval set and budget justify.
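One way to make the ladder concrete is a small selection rule: start at the cheapest rung and move up only while the next rung fits the budget and improves the eval score by a meaningful margin. The sketch below is an illustration under assumptions, not a prescribed procedure. `Rung`, `choose_chunker`, and `min_gain` are hypothetical names, and the scores and costs are made-up numbers, not measured results.

```python
from typing import NamedTuple


class Rung(NamedTuple):
    name: str
    score: float  # retrieval metric on your eval set, higher is better
    cost: float   # estimated cost to chunk the whole corpus


def choose_chunker(ladder: list[Rung], budget: float, min_gain: float) -> str:
    """Climb the ladder (ordered cheapest first) while each step is justified."""
    chosen = ladder[0]
    for rung in ladder[1:]:
        too_expensive = rung.cost > budget
        gain_too_small = rung.score - chosen.score < min_gain
        if too_expensive or gain_too_small:
            break  # stop at the first rung that is not justified
        chosen = rung
    return chosen.name


# Illustrative numbers only.
ladder = [
    Rung("recursive", score=0.70, cost=0.0),
    Rung("semantic", score=0.76, cost=5.0),
    Rung("agentic", score=0.78, cost=60.0),
]
print(choose_chunker(ladder, budget=100.0, min_gain=0.05))  # semantic
```

Here the step from recursive to semantic clears the 0.05 margin, while the further step to agentic does not, so the climb stops at semantic even though the budget would allow more.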