Agentic Architecture & Orchestration
CCA Foundations · Exam Prep Guide · ImagineX Consulting
Design and implement agentic loops for autonomous task execution
An agent is just a loop that keeps asking Claude what to do next until the job is done. Mechanically, it's a while loop around the Messages API: each response carries a stop_reason ("tool_use" = run a tool and loop again, "end_turn" = stop) that drives the next step.
When the model asks for a tool, you run it and send the output back as a user-role tool_result. Each result is tied to the request it answers by a matching tool_use_id — the pairing is by id, not order. Appending those results to the conversation is what lets the model reason about its next action instead of following a pre-wired decision tree — and a failed tool returns the same tool_result with is_error: true so the model can adapt.
Key Behaviors
- One response can include multiple tool_use blocks: collect all results in a single user turn before the next API call; returning them one at a time serializes what should run in parallel
- "end_turn" → exit: the only sanctioned termination signal
- Conversation turns must strictly alternate: assistant turn (tool requests), then user turn (results); making a second API call before appending results breaks the required alternation
- The model picks the next tool from context — your code never pre-programs a sequence
- is_error: true keeps the loop running: the model sees the failure and can retry with new parameters, pick a different tool, or report gracefully; the loop only exits when stop_reason == "end_turn"
How It Works
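The tool-execution half of the loop can be sketched as below. This is a minimal illustration, not the SDK's own helper: the `get_weather` tool, the `TOOLS` registry, and `run_tool_calls` are hypothetical names, and the tool_use blocks are plain dicts here so the snippet runs on its own (the SDK returns objects with `.id`, `.name`, and `.input` attributes instead). It shows every result going into one user turn, each tied to its request by tool_use_id, with failures reported through is_error rather than raised.

```python
from typing import Any, Callable


def get_weather(city: str) -> str:
    """Hypothetical tool used only for illustration."""
    if not city:
        raise ValueError("city must not be empty")
    return f"Sunny in {city}"


# Hypothetical registry mapping tool names to callables.
TOOLS: dict[str, Callable[..., str]] = {"get_weather": get_weather}


def run_tool_calls(tool_use_blocks: list[dict[str, Any]]) -> dict[str, Any]:
    """Run every requested tool and return ONE user turn holding all results."""
    results = []
    for block in tool_use_blocks:
        entry: dict[str, Any] = {"type": "tool_result", "tool_use_id": block["id"]}
        try:
            entry["content"] = TOOLS[block["name"]](**block["input"])
        except Exception as exc:
            # Report the failure to the model instead of crashing the loop.
            entry["content"] = f"{type(exc).__name__}: {exc}"
            entry["is_error"] = True
        results.append(entry)
    return {"role": "user", "content": results}
```

Because the failing call still produces a tool_result, the conversation keeps its assistant/user alternation and the model decides what to do about the error.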
Distractor Traps (common wrong answers)
- "parsing natural language signals to determine loop termination"
- "setting arbitrary iteration caps as the primary stopping mechanism"
- "checking for assistant text content as a completion indicator"
- Pre-configured decision trees or tool sequences instead of model-driven decision-making
Faded Example (agentic loop skeleton — Python / Anthropic SDK)
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
# initial_task, tool_defs, and dispatch() are supplied by your application.
messages = [{"role": "user", "content": initial_task}]
while True:
    response = client.messages.create(
        model="claude-opus-4-8",
        max_tokens=4096,  # required by the Messages API
        tools=tool_defs,
        messages=messages,
    )
    # Loop's continue/stop decision belongs to stop_reason, never to prose.
    if response.stop_reason == "end_turn":  # terminate
        break
    if response.stop_reason != "tool_use":  # e.g. "max_tokens": not a normal exit
        raise RuntimeError(f"unexpected stop_reason: {response.stop_reason}")
    # stop_reason == "tool_use": execute every requested tool, collect results.
    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            result = dispatch(block.name, **block.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,  # matched by id, not position
                "content": result,
            })
    # Append assistant turn, then feed results back as the next user turn.
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": tool_results})
Recommended Material
stop_reason, and the tool_use/tool_result round-trip
Orchestrate multi-agent systems with coordinator-subagent patterns
Run a multi-agent research system on "AI's impact on creative industries" and you can get a polished report that covers only visual arts — missing music, writing, and film — even though every subagent did its own job perfectly. The culprit is overly narrow task decomposition by the coordinator, the hub that owns all routing, delegation, and error handling (subagents never talk to each other directly).
The fix is a coordinator that analyzes each query and dynamically selects which subagents to invoke rather than always routing through the full pipeline. It also has to pass each subagent the context it needs explicitly, because subagents run with isolated context and inherit nothing from the coordinator's conversation history.
Key Behaviors
- Subagent context is isolated — pass complete prior findings explicitly in each subagent's initial prompt
- Coordinator dynamically selects subagents per query, not always the full pipeline
- All inter-subagent communication routes through the coordinator (observability + consistent error handling)
- Iterative refinement: coordinator evaluates synthesis gaps, re-delegates with targeted queries, re-invokes synthesis
- Scope partitioning: assign distinct subtopics or source types per subagent to minimize duplication
How It Works
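The coordinator's two responsibilities described above, selecting subagents per query and re-delegating when the synthesis has gaps, can be sketched as follows. This is an illustrative sketch only: the agent names, the `plan_delegation` and `coverage_gaps` helpers, and the idea of a fixed list of required subtopics are assumptions made for the example; a real coordinator would ask the model to produce the decomposition.

```python
def plan_delegation(subtopics: list[str], needs_documents: bool) -> list[dict]:
    """Pick subagents per query and give each task a distinct subtopic."""
    tasks = [{"agent": "web_search", "scope": topic} for topic in subtopics]
    if needs_documents:
        # Only invoke document analysis when the query actually calls for it.
        tasks += [{"agent": "document_analysis", "scope": topic} for topic in subtopics]
    return tasks


def coverage_gaps(required: list[str], findings: dict[str, list[str]]) -> list[str]:
    """Return required subtopics with no findings yet (re-delegation targets)."""
    return [topic for topic in required if not findings.get(topic)]


required = ["visual arts", "music", "writing", "film"]
findings = {"visual arts": ["Generative tools are changing illustration work"]}
gaps = coverage_gaps(required, findings)       # subtopics the report missed
follow_up = plan_delegation(gaps, needs_documents=False)
```

Applied to the creative-industries example, the gap check surfaces music, writing, and film, so the coordinator re-delegates targeted queries instead of accepting a visual-arts-only report.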
Distractor Traps (common wrong answers)
- "always routing through the full pipeline" instead of dynamically selecting subagents per query
- Blaming downstream agents (synthesis/search/analysis) when the root cause is the coordinator's narrow decomposition
- "not comprehensive enough queries" or "overly restrictive relevance criteria" as the root cause when coordinator scope is the problem
- Assuming subagents inherit the coordinator's conversation history automatically
Configure subagent invocation, context passing, and spawning
To create a subagent, the hub agent — the coordinator — calls one built-in tool: Task. If "Task" isn't in its allowedTools, it can't spawn subagents at all.
Each new subagent starts with no parent history and no shared memory, so you pass context by putting the complete prior findings right in its prompt — structured with metadata (source URLs, document names, page numbers) so attribution survives the synthesis step. To run subagents in parallel, emit several Task calls in a single coordinator response; splitting them across separate turns serializes work that should be concurrent.
Key Behaviors
- allowedTools must include "Task": without it, the coordinator cannot spawn any subagents
- Subagents have no parent context: include complete prior findings in the prompt explicitly
- Use structured data formats to separate content from metadata (source URL, document name, page number) so attribution survives synthesis
- Parallel spawning: emit multiple Task calls in one coordinator response, not across sequential turns
- Coordinator prompts specify goals and quality criteria, not step-by-step instructions (enables subagent adaptability)
How It Works
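Explicit context passing can be sketched as a prompt builder. This is a minimal sketch under assumptions: `build_subagent_prompt` and the finding fields (`claim`, `source_url`, `document`, `page`) are hypothetical names chosen for illustration, and the URL is a placeholder. It shows the prompt stating a goal and quality criteria rather than steps, and carrying prior findings as structured JSON so each claim stays paired with its source metadata.

```python
import json


def build_subagent_prompt(goal: str, quality_criteria: list[str], findings: list[dict]) -> str:
    """Assemble a self-contained prompt: the subagent sees nothing else."""
    return "\n\n".join([
        f"Goal: {goal}",
        "Quality criteria:\n" + "\n".join(f"- {c}" for c in quality_criteria),
        # Findings stay structured so claims and their sources remain paired.
        "Prior findings (JSON):\n" + json.dumps(findings, indent=2),
    ])


prompt = build_subagent_prompt(
    goal="Summarize AI's impact on music",
    quality_criteria=["Cite a source for every claim"],
    findings=[{
        "claim": "Placeholder claim from an earlier subagent",
        "source_url": "https://example.com/report",
        "document": "Example Report",
        "page": 12,
    }],
)
```

The resulting string would be passed as the prompt of a Task call; nothing from the coordinator's own history needs to be assumed by the subagent.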
Distractor Traps (common wrong answers)
- Assuming subagents automatically inherit the coordinator's context or share memory between invocations
- Emitting Task calls across separate turns instead of one response (serializes parallel work)
- Passing unstructured findings (loses attribution metadata such as source URLs and page numbers)
- Writing step-by-step procedural coordinator prompts instead of goal-plus-criteria prompts (reduces adaptability)
- Forgetting to add "Task" to the coordinator's allowedTools
Recommended Material
AgentDefinition, allowedTools, and exactly what a subagent inherits
Implement multi-step workflows with enforcement and handoff patterns
In production, an agent skips the get_customer step about 12% of the time and acts on the wrong account — because a system-prompt instruction is a request, not a guarantee. The fix is a programmatic prerequisite gate: code that refuses lookup_order and process_refund until get_customer has returned a verified customer ID, which eliminates that failure mode entirely. When a case must go to a human, compile a self-contained structured handoff (customer ID, root cause, refund amount, recommended action) — the human agent receiving it has no access to the conversation transcript.
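The self-contained handoff can be sketched as a small data structure. The four fields come from the paragraph above; the `Handoff` class, the `compile_handoff` helper, and the blank-field check are illustrative assumptions rather than a prescribed format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Handoff:
    customer_id: str
    root_cause: str
    refund_amount: float
    recommended_action: str


def compile_handoff(handoff: Handoff) -> str:
    """Serialize the handoff; reject it if any field is blank."""
    payload = asdict(handoff)
    missing = [key for key, value in payload.items() if value in ("", None)]
    if missing:
        # The human has no transcript, so a blank field cannot be recovered later.
        raise ValueError(f"handoff is not self-contained, missing: {missing}")
    return json.dumps(payload, indent=2)
```

Validating at compile time means an incomplete handoff fails inside the agent's workflow, where the missing detail can still be looked up.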
Key Behaviors
- Programmatic prerequisite gates block downstream tools until prerequisites complete: block process_refund until get_customer returns a verified ID
- Prompt instructions alone have a non-zero failure rate: use hooks or gates for deterministic step ordering
- Multi-concern requests: decompose into distinct items → investigate each in parallel with shared context → synthesize one resolution
- Structured handoff summaries must be self-contained: customer ID, root cause, refund amount, recommended action
How It Works
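A programmatic prerequisite gate can be sketched as a thin wrapper around tool dispatch. This is one possible shape, not a library API: the `PrerequisiteGate` class, the `verified` and `customer_id` result fields, and the returned dict layout are assumptions for illustration.

```python
class PrerequisiteGate:
    """Refuses gated tools until get_customer has returned a verified ID."""

    GATED = {"lookup_order", "process_refund"}

    def __init__(self, tools):
        self.tools = tools          # name -> callable (hypothetical backends)
        self.customer_id = None     # set only by a successful get_customer

    def call(self, name, **kwargs):
        if name in self.GATED and self.customer_id is None:
            # In a real loop this goes back to the model as an is_error tool_result.
            return {"is_error": True, "content": "Call get_customer first."}
        result = self.tools[name](**kwargs)
        if name == "get_customer" and result.get("verified"):
            self.customer_id = result["customer_id"]
        return {"is_error": False, "content": result}
```

The ordering rule lives in code, so it holds on every call regardless of what the system prompt says or how the model interprets it.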
Distractor Traps (common wrong answers)
- "enhance the system prompt to state get_customer is mandatory" — prompt instructions remain probabilistic
- "add few-shot examples showing the agent calling get_customer first" — shapes behavior but doesn't enforce it
- "routing classifier that enables only the appropriate subset of tools" — addresses tool availability, not ordering
- Escalating with a transcript link instead of a structured self-contained handoff payload
Apply Agent SDK hooks for tool call interception and data normalization
A hook is a piece of your own code that runs automatically every time a tool is called — it gives you guaranteed control where a system prompt can only make suggestions.
PostToolUse hooks run after a tool returns and transform the result before the model sees it — normalizing Unix timestamps, ISO 8601 dates, and numeric status codes from different MCP tools into one uniform format. PreToolUse hooks run before execution and can block a call outright: a refund above $500 never reaches the backend; the hook redirects it to human escalation.
The exam discriminator: hooks give guarantees; prompts give probabilities.
Key Behaviors
- Hook configuration requires a matcher field: a tool name or regex pattern that selects which tool calls trigger it; unmatched calls pass through untouched
- A PreToolUse hook writes its decision to stdout as JSON: {"decision": "block", "reason": "..."} stops the call; omitting decision allows it through
- Choose hooks over prompt-based enforcement when business rules require guaranteed compliance
- PostToolUse runs after the tool returns, so it cannot prevent tool execution; use PreToolUse to block calls before they happen
How It Works
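The normalization work a PostToolUse hook performs can be sketched as a pure function. This is an assumption-laden illustration: the `timestamp` and `status` field names, the `STATUS_NAMES` mapping, and the choice to treat timezone-less strings as UTC are all hypothetical, and how the hook receives and returns the tool result is left out.

```python
from datetime import datetime, timezone

# Hypothetical mapping from numeric status codes to one shared vocabulary.
STATUS_NAMES = {200: "ok", 404: "not_found", 500: "error"}


def normalize_timestamp(value) -> str:
    """Return an ISO 8601 UTC string for a Unix timestamp or an ISO 8601 string."""
    if isinstance(value, (int, float)):
        moment = datetime.fromtimestamp(value, tz=timezone.utc)
    else:
        moment = datetime.fromisoformat(value)
        if moment.tzinfo is None:
            moment = moment.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return moment.astimezone(timezone.utc).isoformat()


def normalize_result(result: dict) -> dict:
    """Copy a tool result with its timestamp and status in uniform formats."""
    out = dict(result)
    if "timestamp" in out:
        out["timestamp"] = normalize_timestamp(out["timestamp"])
    if isinstance(out.get("status"), int):
        out["status"] = STATUS_NAMES.get(out["status"], "unknown")
    return out
```

With this in place, tools that report time as Unix seconds and tools that report ISO 8601 strings look identical by the time the model reads them.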
Distractor Traps (common wrong answers)
- Choosing prompt-based enforcement when business rules require guaranteed compliance — prompts are probabilistic
- Using PostToolUse to prevent a tool from executing: it runs after; use PreToolUse to block before execution
- Normalizing heterogeneous data formats in the system prompt instead of a PostToolUse hook
- Treating "deterministic" and "probabilistic" compliance as equivalent when policy violations have financial consequences
Faded Example (hook configuration — Claude Code settings.json)
// Two hook types: one normalizes results, one blocks bad calls.
{
  "hooks": {
    "PostToolUse": [{            // runs AFTER the tool; rewrites the result
      "matcher": "get_customer|lookup_order",
      "hooks": [{ "type": "command", "command": "normalize_output.py" }]
    }],
    "PreToolUse": [{             // fill in: hook type that runs BEFORE a tool call
      "matcher": "process_refund",
      "hooks": [{ "type": "command", "command": "check_refund_policy.py" }]
    }]
  }
}
// What check_refund_policy.py writes to stdout when rejecting a call:
{
  "decision": "block",           // fill in: value that STOPS execution
  "reason": "Amount > $500: escalate to human"
}
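The blocking script named in the configuration could look roughly like the sketch below. It assumes the hook receives the pending call as JSON on stdin with the tool's arguments under `tool_input`; the `amount` argument name and the `decide` helper are hypothetical, since the guide does not show the process_refund schema.

```python
import json
import sys

REFUND_LIMIT = 500  # dollars; the threshold named in the text


def decide(hook_input: dict) -> dict:
    """Return the JSON object the hook should print for this tool call."""
    # "amount" is an assumed argument name for process_refund.
    amount = hook_input.get("tool_input", {}).get("amount", 0)
    if amount > REFUND_LIMIT:
        return {"decision": "block", "reason": "Amount > $500: escalate to human"}
    return {}  # no "decision" key: the call is allowed through


def main() -> None:
    """Entry point when run as the hook command: stdin in, stdout out."""
    print(json.dumps(decide(json.load(sys.stdin))))
```

To use it as the hook command, the script would call `main()` at the bottom of the file. Keeping the policy in `decide` makes the rule testable without spawning a process.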
Recommended Material
PostToolUse hook configuration
Design task decomposition strategies for complex workflows
Prompt chaining is a fixed sequential pipeline where each step's shape is known before the work starts. Dynamic decomposition generates the next subtasks from what the previous step discovered — it adapts as the investigation unfolds.
For a large multi-file code review, prompt chaining is the right fit: per-file local analysis passes first, then a separate cross-file integration pass. For open-ended investigation, dynamic decomposition maps structure first, identifies high-impact areas, then creates a prioritized plan that adapts as dependencies surface.
The failure mode the exam loves: running a single mixed pass over a large multi-file code review instead of splitting per-file + cross-file. Switching to a larger context-window model doesn't fix this — bigger context doesn't eliminate attention dilution; decomposition does.
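The per-file plus cross-file split can be sketched as a fixed two-stage chain. This is a minimal sketch: `review_in_passes` is a hypothetical helper, `ask` stands in for whatever function sends a prompt to the model and returns text, and the prompt wording is illustrative.

```python
from typing import Callable


def review_in_passes(files: dict[str, str], ask: Callable[[str], str]) -> dict:
    """Fixed two-stage chain: one local pass per file, then one integration pass."""
    local = {
        path: ask(f"Review this file in isolation:\n{source}")
        for path, source in files.items()
    }
    # The integration pass sees the per-file notes, not every file's full text.
    notes = "\n".join(f"{path}: {finding}" for path, finding in local.items())
    cross = ask(f"Given these per-file notes, review cross-file interactions:\n{notes}")
    return {"per_file": local, "cross_file": cross}
```

Each call carries one focused question, which is the point of the split: attention is spent on one file at a time, and the integration pass works from condensed notes.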
Key Behaviors
- Prompt chaining: fixed sequential pipeline; use when the steps are predictable and the shape of the work is known ahead of time
- Dynamic decomposition: adaptive planning; use when the next subtask depends on what the previous step discovered (open-ended investigation)
- Split large code reviews into per-file local analysis passes + a separate cross-file integration pass to avoid attention dilution
- Decompose open-ended tasks by mapping structure first, identifying high-impact areas, then creating a prioritized plan that adapts as dependencies are found
- Choosing a larger context-window model is wrong — it does not fix attention dilution; decomposition does
How It Works
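Dynamic decomposition, by contrast, can be sketched as a priority queue whose contents grow from what each step discovers. This is an illustrative sketch: `investigate` and `explore` are hypothetical names, and `explore` stands in for a model call that returns follow-up subtasks with priorities.

```python
import heapq
from typing import Callable


def investigate(start: str, explore: Callable[[str], list[tuple[int, str]]]) -> list[str]:
    """Work the highest-priority subtask first; each result may add new subtasks."""
    queue = [(0, start)]        # (priority, subtask); lower number runs sooner
    seen = {start}
    order = []
    while queue:
        _, task = heapq.heappop(queue)
        order.append(task)
        for priority, follow_up in explore(task):   # plan adapts to discoveries
            if follow_up not in seen:
                seen.add(follow_up)
                heapq.heappush(queue, (priority, follow_up))
    return order
```

Unlike a prompt chain, nothing about the sequence is known up front: the order of work is decided by what earlier subtasks return.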
Distractor Traps (common wrong answers)
- Running a single mixed pass over a large multi-file code review instead of per-file + cross-file integration passes
- Switching to a larger context-window model to "give all files adequate attention in one pass" — bigger context does not fix attention dilution
- Using prompt chaining for open-ended investigation where subtasks must adapt to discoveries
- Using dynamic decomposition for predictable multi-aspect work where chaining is more reliable and auditable
- Treating decomposition as a one-size-fits-all pattern rather than selecting by workflow shape
Manage session state, resumption, and forking
--resume <session-name> continues a named conversation where it left off; fork_session branches from a shared baseline like a git branch so you can explore two divergent approaches without either polluting the other; a fresh session with an injected structured summary is the safe path when prior tool results can no longer be trusted.
Use --resume when prior context and tool results are still valid. Use fork_session to test two different refactoring strategies or investigation paths from the same common starting point.
When files have changed or system state has shifted, neither pattern is safe — start fresh and inject a structured summary that replaces stale tool results with current facts. The trap: resuming after code changes without telling the agent what changed, leaving it to reason from outdated file reads.
Key Behaviors
- --resume <session-name>: continues a named prior conversation; use when prior context and tool results are still valid
- fork_session: creates an independent branch from a shared analysis baseline; use to explore divergent approaches from a common starting point
- Fresh session + injected summary: start clean when prior tool results are stale; the structured summary replaces what the agent can no longer trust
- After file modifications, inform a resumed session about specific changes rather than requiring full re-exploration
- Treat --resume and fork_session as distinct: one continues the thread, the other branches it
How It Works
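The choice among the three patterns can be sketched as a small decision function plus a summary builder. Both helpers are hypothetical illustrations of the reasoning above, not part of any CLI or SDK, and the summary layout is an assumption.

```python
def choose_session_strategy(results_still_valid: bool, exploring_alternatives: bool) -> str:
    """Map the two questions from the text onto a session pattern."""
    if not results_still_valid:
        return "fresh_session_with_summary"   # stale tool results cannot be trusted
    if exploring_alternatives:
        return "fork_session"                 # branch from the shared baseline
    return "resume"                           # continue the same thread


def build_summary(goal: str, current_facts: list[str], changed_files: list[str]) -> str:
    """Structured summary to inject as the first message of a fresh session."""
    lines = [f"Goal: {goal}", "Current facts:"]
    lines += [f"- {fact}" for fact in current_facts]
    lines.append("Files changed since the previous session:")
    lines += [f"- {path}" for path in changed_files] or ["- none"]
    return "\n".join(lines)
```

Staleness is checked first because it overrides the other question: a fork from an out-of-date baseline inherits the same stale tool results.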
Distractor Traps (common wrong answers)
- Resuming after file modifications without informing the agent which files changed — it will reason from stale tool results
- Resuming when prior tool results are stale instead of starting fresh with a structured summary
- Forking a session whose baseline is itself out of date
- Treating --resume and fork_session as interchangeable: one continues, the other branches from a shared point
Recommended Material
--resume, fork_session, and injected summary patterns
Further Reading: Claude Docs
Official Anthropic documentation for the concepts in this domain.
Further Reading: Anthropic Academy on Skilljar
Optional self-paced courses.
Further Viewing: Peace Of Code (YouTube)
Watch if a topic is still unclear.
CCA Full Course — Peace Of Code