Map-reduce summarization over many chunks
Summarize each chunk independently, then synthesize a single coherent summary — two prompts, structured handoff.
Use this for documents that exceed your context budget, or where single-pass summaries lose detail. Map each chunk to a small summary, then reduce the summaries into one.
The prompt
Copy this verbatim. Replace the {{ … }} placeholders with your values.
<!-- PASS 1 (map): run this once per chunk, in parallel -->
<instructions>
You are summarizing one chunk of a larger document. Produce JSON inside <chunk_summary> tags:
{
"key_claims": ["string"], // up to 5; one fact per item, verbatim where possible
"entities": ["string"], // people, orgs, products mentioned
"open_questions": ["string"], // things this chunk raises but does not answer
"chunk_id": "{{ chunk_id }}"
}
Do not generalize beyond this chunk. Do not reference "the document" — you only see one chunk.
</instructions>
<chunk id="{{ chunk_id }}">{{ chunk_text }}</chunk>
Return inside <chunk_summary> tags.
<!-- PASS 2 (reduce): one call, taking all chunk_summary outputs as input -->
<instructions>
You will receive N per-chunk summaries inside <chunk_summary> tags. Synthesize a final summary.
Output three sections, each in its own tag:
<tldr>One paragraph, ≤120 words.</tldr>
<key_findings>Bulleted, deduplicated across chunks. Cite chunk_id in [c:ID] format.</key_findings>
<unanswered>Bulleted list of open questions that no chunk answered.</unanswered>
Rules:
- Deduplicate claims that appear in multiple chunks.
- Surface contradictions explicitly: "[c:3] states X but [c:7] states Y."
- Do not invent facts not present in any chunk_summary.
</instructions>
{{ chunk_summaries_concatenated }}
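The handoff between the two passes can be sketched in Python as follows. This is a minimal illustration and not part of the prompt itself: the helper names `fill`, `parse_chunk_summary`, and `concatenate_summaries` are hypothetical, and the sketch assumes the model returns strict JSON inside the tags, without the `//` annotations shown in the schema.

```python
import json
import re

# Only the chunk line of the pass 1 prompt, shown here for brevity.
MAP_TEMPLATE = '<chunk id="{{ chunk_id }}">{{ chunk_text }}</chunk>'


def fill(template: str, values: dict[str, str]) -> str:
    """Replace each {{ name }} placeholder with its value (KeyError if missing)."""
    def sub(match: re.Match) -> str:
        return values[match.group(1)]
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", sub, template)


def parse_chunk_summary(response: str) -> dict:
    """Extract and decode the JSON inside the first <chunk_summary> block."""
    m = re.search(r"<chunk_summary>(.*?)</chunk_summary>", response, re.DOTALL)
    if m is None:
        raise ValueError("no <chunk_summary> block in response")
    return json.loads(m.group(1))


def concatenate_summaries(summaries: list[dict]) -> str:
    """Build the value for the {{ chunk_summaries_concatenated }} placeholder."""
    return "\n".join(
        f"<chunk_summary>{json.dumps(s, ensure_ascii=False)}</chunk_summary>"
        for s in summaries
    )
```

Parsing each pass 1 response before concatenating means a malformed chunk summary fails early, rather than silently reaching pass 2.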
Sample input
Pass 1: a 6 KB chunk of a research report. Pass 2: the 12 chunk_summary blobs produced by pass 1.
Expected output
<tldr>
The report finds that small fine-tuned retrievers (110M params) match 7B baselines on domain queries
at one-tenth the cost, with most gains coming from query-log diversity rather than model size...
</tldr>
<key_findings>
- Fine-tuned 110M retriever beats 7B baseline by 4 nDCG@10 points on the in-domain test set [c:3, c:8]
- 8× latency reduction with no measurable recall loss [c:4]
- Generalization to long-tail queries is unverified [c:9, c:11]
</key_findings>
<unanswered>
- How does the approach hold up on multilingual queries?
- What is the cost of building the 50k query training set in domains without query logs?
</unanswered>
Notes & tuning tips
- Two prompts, not one. Parallelize pass 1 across chunks; pass 2 runs once on the concatenated outputs.
- Per-chunk JSON output is what makes deduplication tractable in pass 2 — free prose doesn't deduplicate cleanly.
- If you can't run two passes, single-pass summarization with all chunks inline is acceptable up to ~50k tokens; beyond that, accuracy degrades sharply.
- Chunk IDs are load-bearing — they're how the final summary cites back to source positions.
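The two-pass flow described in these notes might be orchestrated along the following lines. This is a sketch under stated assumptions: `call_model` is a hypothetical function that takes a prompt string and returns the response text (substitute your own client), and the two prompt builders are placeholders for code that fills the templates above. A thread pool is used because the pass 1 calls are independent and I/O-bound.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def map_reduce(
    chunks: dict[str, str],                       # chunk_id -> chunk_text
    call_model: Callable[[str], str],             # hypothetical: prompt in, text out
    build_map_prompt: Callable[[str, str], str],  # (chunk_id, chunk_text) -> pass 1 prompt
    build_reduce_prompt: Callable[[list[str]], str],  # pass 1 outputs -> pass 2 prompt
    max_workers: int = 8,
) -> str:
    """Run pass 1 once per chunk in parallel, then pass 2 once."""
    ids = list(chunks)
    prompts = [build_map_prompt(cid, chunks[cid]) for cid in ids]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so outputs line up with ids.
        map_outputs = list(pool.map(call_model, prompts))
    return call_model(build_reduce_prompt(map_outputs))
```

Because the chunk ID is passed into every pass 1 prompt, the ordering of pass 1 outputs does not affect citations; preserving order simply keeps the reduce input easy to read.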
What this example uses
Tags: <instructions> <chunk> <chunk_summary> <tldr> <key_findings> <unanswered>
Patterns: multi-document, structured output
Map-reduce summarization over many chunks. claudexml.com. https://claudexml.com/examples/map-reduce-summarize/