Question Task 5.1

What is a "case facts block" and why is it injected outside the summarized history?

Answer Task 5.1

A persistent structured object holding transactional facts — amounts, dates, order numbers, statuses — injected verbatim into every prompt. Summarization turns "$47.20" into "large amount"; keeping it outside history prevents precision loss.
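
As a rough illustration, the sketch below keeps the facts in a small structured object and renders it verbatim ahead of the summarized history. The `CaseFacts` fields, the `build_prompt` helper, and the sample values are assumptions made for this example, not a prescribed schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class CaseFacts:
    """Hypothetical facts block; amounts stay as exact strings."""
    order_number: str
    amount: str
    order_date: str
    status: str


def build_prompt(facts: CaseFacts, summarized_history: str, user_message: str) -> str:
    # The facts block is serialized as-is and never passes through summarization.
    facts_block = json.dumps(asdict(facts), indent=2)
    return (
        "CASE FACTS (verbatim, do not paraphrase):\n"
        f"{facts_block}\n\n"
        "CONVERSATION SUMMARY:\n"
        f"{summarized_history}\n\n"
        "CUSTOMER MESSAGE:\n"
        f"{user_message}"
    )


facts = CaseFacts("ORD-1001", "$47.20", "2024-03-02", "refund_pending")
prompt = build_prompt(facts, "Customer disputed a charge.", "Where is my refund?")
```

Because the summary and the facts block are separate inputs, the summary can be regenerated every few turns without touching the exact values.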

Question Task 5.1

What is the "lost in the middle" effect and how do you counteract it?

Answer Task 5.1

Models reliably process content at the start and end of long inputs but may skip material buried in the middle. Counteract by placing key findings first and organizing detailed results with explicit section headers.
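
One possible way to apply this when assembling a long input is sketched below: key findings go first, and each block of detail gets its own header. The `assemble_context` helper, the header style, and the sample findings are illustrative assumptions.

```python
def assemble_context(key_findings: list[str], sections: dict[str, str]) -> str:
    """Put key findings at the top, then detailed results under explicit headers."""
    parts = ["## Key findings"]
    parts += [f"- {finding}" for finding in key_findings]
    for title, body in sections.items():
        parts.append("")
        parts.append(f"## {title}")
        parts.append(body)
    return "\n".join(parts)


context = assemble_context(
    ["Refund was issued twice", "Second refund is still pending"],
    {
        "Order lookup details": "order_id=ORD-1001 status=refund_pending",
        "Payment log details": "2 refund events recorded",
    },
)
```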

Question Task 5.1

Why trim verbose tool output before appending it to context?

Answer Task 5.1

A single order lookup can return 40+ fields when only 5 matter. Untrimmed outputs accumulate tokens disproportionately to their relevance, degrading context efficiency across turns.
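
A minimal sketch of trimming, assuming the tool returns a flat dictionary. The field names in `RELEVANT_FIELDS` and the sample lookup are hypothetical; which fields matter depends on the task.

```python
# Hypothetical allow-list: the handful of fields the agent actually needs.
RELEVANT_FIELDS = ("order_id", "status", "total", "ship_date", "tracking_number")


def trim_tool_output(raw: dict, keep=RELEVANT_FIELDS) -> dict:
    """Keep only allow-listed fields before the result is appended to context."""
    return {key: raw[key] for key in keep if key in raw}


# Simulate a verbose lookup: 40 irrelevant fields plus the 5 that matter.
raw_lookup = {f"field_{i}": i for i in range(40)}
raw_lookup.update(
    order_id="ORD-1001",
    status="shipped",
    total="47.20",
    ship_date="2024-03-02",
    tracking_number="1Z999",
)
trimmed = trim_tool_output(raw_lookup)
```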

Question Task 5.1

How do you handle context for a session with multiple distinct customer issues?

Answer Task 5.1

Extract and persist structured issue data (order IDs, amounts, statuses) into a separate context layer for each issue — preventing distinct cases from blurring together across turns.

Question Task 5.1

What should upstream subagents return when downstream agents have limited context budgets?

Answer Task 5.1

Structured data — key facts, citations, relevance scores — rather than verbose content and reasoning chains. This includes metadata like dates and source locations for accurate downstream synthesis.

Question Task 5.2

What are the three legitimate escalation triggers for a customer support agent?

Answer Task 5.2

(1) Customer explicitly requests a human. (2) Policy is silent or ambiguous on the specific request. (3) Agent cannot make meaningful progress. Sentiment and confidence scores are not triggers.
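
The three triggers could be expressed as a simple decision function, sketched below. The `TurnState` fields and the returned reason strings are assumptions; the point is that sentiment and self-reported confidence are present in the state but never consulted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TurnState:
    """Hypothetical per-turn state for a support agent."""
    customer_requested_human: bool
    policy_covers_request: bool
    agent_can_progress: bool
    sentiment_score: float = 0.0   # recorded, but deliberately not a trigger
    self_confidence: float = 1.0   # recorded, but deliberately not a trigger


def should_escalate(state: TurnState) -> Optional[str]:
    """Return the escalation reason, or None if the agent should keep going."""
    if state.customer_requested_human:
        return "customer_requested_human"
    if not state.policy_covers_request:
        return "policy_gap"
    if not state.agent_can_progress:
        return "no_progress"
    return None


# An angry customer with a resolvable, policy-covered issue is not escalated.
angry_but_resolvable = TurnState(False, True, True, sentiment_score=-0.9)
wants_human = TurnState(True, True, True)
```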

Question Task 5.2

Why are sentiment analysis and self-reported confidence scores unreliable escalation signals?

Answer Task 5.2

Sentiment measures emotional tone, not case complexity. LLM self-reported confidence is poorly calibrated: the agent is often confidently wrong on exactly the hard cases that most need escalation.

Question Task 5.2

A customer says "I want to speak to a manager." What does the agent do first?

Answer Task 5.2

Route to a human agent immediately — no investigation, no attempt to resolve. Honoring an explicit human request without delay is a hard rule encoded in the system prompt.

Question Task 5.2

A frustrated customer has a delayed shipment the agent can resolve. What does the agent do?

Answer Task 5.2

Acknowledge the frustration and offer a resolution, since the issue is within the agent's capability. Escalate only if the customer then asks for a human agent.

Question Task 5.2

A customer lookup returns multiple matching records. What should the agent do?

Answer Task 5.2

Ask for an additional identifier — email, order number, or ZIP code — to disambiguate. Never select a customer record by heuristic when multiple matches exist.

Question Task 5.3

What four fields should a structured error context include?

Answer Task 5.3

failure_type · attempted_query (the exact query tried) · partial_results (what was found before failure) · alternatives (other approaches to try). The coordinator needs all four to recover intelligently.
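
A minimal sketch of the four fields as a data structure. The field names follow the card; the types, the `ErrorContext` class name, and the sample values are assumptions.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class ErrorContext:
    failure_type: str                                     # e.g. "timeout"
    attempted_query: str                                  # the exact query tried
    partial_results: list = field(default_factory=list)   # found before failure
    alternatives: list = field(default_factory=list)      # other approaches to try


# Hypothetical error a subagent might hand back to its coordinator.
error = ErrorContext(
    failure_type="timeout",
    attempted_query="refund policy 2024",
    partial_results=["policy_v1.pdf"],
    alternatives=["retry with a narrower date range", "search the archive index"],
)
payload = asdict(error)
```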

Question Task 5.3

What is the difference between an access failure and a valid empty result?

Answer Task 5.3

Access failure: the query couldn't run (timeout, permission denied) — a retry decision is needed. Valid empty result: the query ran successfully but matched nothing — no retry needed, the absence is the answer.
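
The distinction can be made explicit in the result type, as in the sketch below. `SearchResult`, `run_search`, and the two stand-in backends are hypothetical; only the built-in `TimeoutError` and `PermissionError` exceptions are real Python names.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class SearchResult:
    ok: bool                                   # True means the query actually ran
    items: list = field(default_factory=list)
    failure_type: Optional[str] = None


def run_search(query: str, backend: Callable[[str], list]) -> SearchResult:
    try:
        return SearchResult(ok=True, items=backend(query))
    except TimeoutError:
        return SearchResult(ok=False, failure_type="timeout")
    except PermissionError:
        return SearchResult(ok=False, failure_type="permission_denied")


def needs_retry_decision(result: SearchResult) -> bool:
    # A successful query with zero matches is an answer, not a failure.
    return not result.ok


def empty_backend(query: str) -> list:
    return []


def timeout_backend(query: str) -> list:
    raise TimeoutError("backend did not respond")


valid_empty = run_search("orders for unknown customer", empty_backend)
access_failure = run_search("orders for unknown customer", timeout_backend)
```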

Question Task 5.3

Why is returning a generic "search unavailable" status an anti-pattern?

Answer Task 5.3

It hides the failure type, the attempted query, and partial results. The coordinator cannot make an intelligent recovery decision — retry, reroute, proceed partial, or escalate — without this context.

Question Task 5.3

When should a subagent handle an error locally vs. propagate it to the coordinator?

Answer Task 5.3

Handle locally only transient failures the subagent can resolve (brief timeout — retry once). Propagate errors the subagent cannot resolve, always including what was attempted and any partial results.
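
A sketch of that split, assuming timeouts are the only transient failure and permission errors are never retried. The `fetch_with_local_retry` helper, the result dictionaries, and the suggested alternatives are illustrative.

```python
def fetch_with_local_retry(fetch, query: str, retries: int = 1) -> dict:
    """Retry transient timeouts locally; propagate everything else with context."""
    attempts = 0
    while True:
        try:
            return {"status": "ok", "results": fetch(query)}
        except TimeoutError:
            attempts += 1
            if attempts > retries:
                # Local retry budget is spent: hand the decision to the coordinator.
                return {
                    "status": "error",
                    "failure_type": "timeout",
                    "attempted_query": query,
                    "partial_results": [],
                    "alternatives": ["retry later", "use a cached index"],
                }
        except PermissionError:
            # Not transient, so no local retry.
            return {
                "status": "error",
                "failure_type": "permission_denied",
                "attempted_query": query,
                "partial_results": [],
                "alternatives": ["request access", "query a public source"],
            }


calls = {"n": 0}


def flaky_fetch(query: str) -> list:
    """Stand-in backend that times out once, then succeeds."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise TimeoutError("brief timeout")
    return ["doc-1"]


def always_timeout(query: str) -> list:
    raise TimeoutError("still down")


recovered = fetch_with_local_retry(flaky_fetch, "shipping policy")
propagated = fetch_with_local_retry(always_timeout, "shipping policy")
```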

Question Task 5.4

What is context degradation in extended codebase exploration sessions?

Answer Task 5.4

After many turns, the model starts giving inconsistent answers and citing "typical patterns" rather than specific classes or structures it actually discovered earlier in the session.

Question Task 5.4

What is the scratchpad file pattern and why is it used?

Answer Task 5.4

A file (e.g., findings.md) on disk where agents write key findings as they discover them. Re-read before subsequent questions to counteract context degradation and prevent drift back to "typical patterns."
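
A small sketch of the pattern using only the standard library. The helper names and the sample finding (including its class names) are hypothetical, and the example writes to a temporary directory rather than a real project.

```python
import tempfile
from pathlib import Path


def record_finding(scratchpad: Path, finding: str) -> None:
    """Append one finding as soon as it is discovered."""
    with scratchpad.open("a", encoding="utf-8") as handle:
        handle.write(f"- {finding}\n")


def load_findings(scratchpad: Path) -> str:
    """Re-read the scratchpad so its contents can be placed in the next prompt."""
    return scratchpad.read_text(encoding="utf-8") if scratchpad.exists() else ""


workdir = Path(tempfile.mkdtemp())
scratchpad = workdir / "findings.md"
record_finding(scratchpad, "OrderService delegates refunds to RefundProcessor")
record_finding(scratchpad, "RefundProcessor reads limits from config/refunds.yaml")
notes = load_findings(scratchpad)
```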

Question Task 5.4

Why delegate verbose codebase exploration to subagents instead of the main coordinator?

Answer Task 5.4

Raw file listings and dependency traces stay in the subagent's isolated context. The coordinator receives only a compact summary — its context stays uncluttered for high-level coordination.

Question Task 5.4

What is /compact used for in Claude Code?

Answer Task 5.4

Reduces context usage when an extended exploration session has filled the context window with verbose discovery output. Allows the session to continue without hitting context limits.

Question Task 5.4

How does crash recovery work in a multi-agent codebase exploration?

Answer Task 5.4

Each agent exports state to a known location. On resume, the coordinator loads a structured manifest and injects agent state into prompts — the exploration resumes without restarting from scratch.
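
One way this could look with JSON files on disk is sketched below. The file layout, the `manifest.json` name, and the state fields are assumptions for the example, not a documented format.

```python
import json
import tempfile
from pathlib import Path


def export_state(state_dir: Path, agent_id: str, state: dict) -> None:
    """Write one agent's state and register it in the shared manifest."""
    state_path = state_dir / f"{agent_id}.json"
    state_path.write_text(json.dumps(state), encoding="utf-8")
    manifest_path = state_dir / "manifest.json"
    manifest = (
        json.loads(manifest_path.read_text(encoding="utf-8"))
        if manifest_path.exists()
        else {}
    )
    manifest[agent_id] = state_path.name
    manifest_path.write_text(json.dumps(manifest), encoding="utf-8")


def resume_prompts(state_dir: Path) -> dict:
    """On resume, load the manifest and build one prompt preamble per agent."""
    manifest_path = state_dir / "manifest.json"
    if not manifest_path.exists():
        return {}
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    prompts = {}
    for agent_id, filename in manifest.items():
        state = json.loads((state_dir / filename).read_text(encoding="utf-8"))
        prompts[agent_id] = "Resume from saved state:\n" + json.dumps(state, indent=2)
    return prompts


state_dir = Path(tempfile.mkdtemp())
export_state(state_dir, "explorer-1", {"visited": ["src/api"], "next": "src/db"})
export_state(state_dir, "explorer-2", {"visited": [], "next": "tests"})
prompts = resume_prompts(state_dir)
```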

Question Task 5.5

Why is "97% aggregate accuracy" insufficient before reducing human review?

Answer Task 5.5

A 97% overall metric can hide one document type or field failing at 71%. Always analyze accuracy by document type and by field before reducing human review on any segment.
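
The arithmetic is easy to reproduce with synthetic data: 900 correct invoice totals plus 100 receipt dates of which 71 are correct gives 971/1000, about 97% overall. The record layout and the document types below are made up for illustration.

```python
from collections import defaultdict


def accuracy_by_segment(records: list) -> dict:
    """Accuracy per (doc_type, field) pair instead of one aggregate number."""
    totals = defaultdict(lambda: [0, 0])  # key -> [correct, seen]
    for record in records:
        key = (record["doc_type"], record["field"])
        totals[key][0] += int(record["correct"])
        totals[key][1] += 1
    return {key: correct / seen for key, (correct, seen) in totals.items()}


# Synthetic evaluation set (not real results).
records = (
    [{"doc_type": "invoice", "field": "total", "correct": True}] * 900
    + [{"doc_type": "receipt", "field": "date", "correct": True}] * 71
    + [{"doc_type": "receipt", "field": "date", "correct": False}] * 29
)
overall = sum(r["correct"] for r in records) / len(records)
segments = accuracy_by_segment(records)
```

Here `overall` looks healthy while `segments[("receipt", "date")]` exposes the weak segment that the aggregate hides.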

Question Task 5.5

What is stratified random sampling in human review workflows?

Answer Task 5.5

Sampling high-confidence auto-accepted extractions by document type and field (not uniformly) to measure ongoing error rates and detect novel error patterns before they propagate across large volumes.
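
A minimal sketch of per-stratum sampling with the standard library. The stratum key of (document type, field), the `per_stratum` quota, and the synthetic items are assumptions; a real workflow would choose quotas from its own volumes.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(items: list, per_stratum: int, seed: int = 0) -> list:
    """Draw up to `per_stratum` items at random from each (doc_type, field) group."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[(item["doc_type"], item["field"])].append(item)
    sample = []
    for key in sorted(strata):
        group = strata[key]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample


# 1000 auto-accepted invoice totals and only 10 receipt dates (synthetic).
items = [{"doc_type": "invoice", "field": "total", "id": i} for i in range(1000)]
items += [{"doc_type": "receipt", "field": "date", "id": i} for i in range(10)]
sample = stratified_sample(items, per_stratum=5)
counts = Counter((s["doc_type"], s["field"]) for s in sample)
```

A uniform draw of 10 from these 1010 items would usually contain no receipts at all; the stratified draw guarantees the rare segment is reviewed.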

Question Task 5.5

Why calibrate field-level confidence thresholds on a labeled validation set?

Answer Task 5.5

Raw model confidence is not calibrated out of the box. Calibration against a labeled set ensures the threshold actually corresponds to the expected error rate for routing decisions.
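
One simple way to calibrate, sketched under the assumption that the validation set for a field is a list of (confidence, is_correct) pairs: pick the lowest threshold at which the auto-accepted extractions stay within the target error rate. The helper name and the sample numbers are illustrative, and this would be run once per field.

```python
def calibrate_threshold(validation: list, max_error_rate: float):
    """Lowest confidence threshold whose accepted set meets the error target.

    Returns None if no threshold on this validation set meets the target.
    """
    candidates = sorted({confidence for confidence, _ in validation})
    for threshold in candidates:
        accepted = [ok for confidence, ok in validation if confidence >= threshold]
        errors = sum(1 for ok in accepted if not ok)
        if errors / len(accepted) <= max_error_rate:
            return threshold
    return None


# Synthetic labeled validation pairs for a single field.
validation = [(0.99, True), (0.95, True), (0.90, False), (0.85, True), (0.60, False)]
strict = calibrate_threshold(validation, max_error_rate=0.0)
lenient = calibrate_threshold(validation, max_error_rate=0.25)
```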

Question Task 5.5

What two types of extractions should be routed to human review?

Answer Task 5.5

Extractions with low model confidence AND extractions from ambiguous or contradictory source documents. Both signal cases where automated acceptance carries significant risk.

Question Task 5.6

What four fields should every structured claim-source mapping include?

Answer Task 5.6

Source URL · document name · relevant excerpt · publication/collection date. These travel with each claim through synthesis — they cannot be reconstructed after the fact once stripped.
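
A sketch of a mapping that carries the four fields alongside the claim itself. The `ClaimSource` class, the `render_citation` helper, and the sample record (including the placeholder URL) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimSource:
    claim: str
    source_url: str
    document_name: str
    excerpt: str
    date: str  # publication or collection date, ISO 8601


def render_citation(mapping: ClaimSource) -> str:
    """Keep the attribution attached to the claim when it is rendered."""
    return (
        f"{mapping.claim} "
        f"[{mapping.document_name}, {mapping.date}, {mapping.source_url}]: "
        f'"{mapping.excerpt}"'
    )


mapping = ClaimSource(
    claim="Revenue grew 14% year over year",
    source_url="https://example.com/reports/annual",
    document_name="Annual Report",
    excerpt="Revenue increased 14% compared with the prior year.",
    date="2024-02-15",
)
citation = render_citation(mapping)
```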

Question Task 5.6

Source A shows +14% revenue; Source B shows +9%. What does the synthesis agent do?

Answer Task 5.6

Annotate both values with their sources and dates — do not select one arbitrarily. The coordinator decides reconciliation. The synthesis agent surfaces the conflict; it does not resolve it.
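
A minimal sketch of surfacing a conflict without resolving it. The claim layout, the `annotate_conflicts` helper, and the dates attached to the two sources are assumptions for the example.

```python
from collections import defaultdict


def annotate_conflicts(claims: list) -> list:
    """Group claims by metric and flag disagreement, keeping every value."""
    by_metric = defaultdict(list)
    for claim in claims:
        by_metric[claim["metric"]].append(claim)
    report = []
    for metric, group in by_metric.items():
        distinct_values = {claim["value"] for claim in group}
        report.append({
            "metric": metric,
            "conflict": len(distinct_values) > 1,
            # Nothing is dropped or averaged; the coordinator reconciles.
            "reported_values": [
                {"value": c["value"], "source": c["source"], "date": c["date"]}
                for c in group
            ],
        })
    return report


claims = [
    {"metric": "revenue_growth", "value": "+14%", "source": "Source A", "date": "2024-01-31"},
    {"metric": "revenue_growth", "value": "+9%", "source": "Source B", "date": "2024-04-30"},
]
report = annotate_conflicts(claims)
```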

Question Task 5.6

Why must structured outputs include publication or collection dates?

Answer Task 5.6

A Q1 figure vs. a Q4 figure may be a temporal update, not a factual contradiction. Without dates, the coordinator cannot distinguish a time-series difference from a genuine conflict.

Question Task 5.6

What content types require different rendering in synthesis output?

Answer Task 5.6

Financial data → tables. News/qualitative findings → prose. Technical findings → structured lists. Converting everything to one format loses structure or nuance from the original source type.

Question Task 5.6

What is the "source attribution loss" failure mode in multi-source synthesis?

Answer Task 5.6

A summarization step strips source URLs, document names, and excerpts from findings. The final report presents claims with no traceability — downstream consumers cannot verify or cite sources.

