AgentCore

Chapter 9

boss

~40 min

Observability and Evaluation

CloudWatch GenAI metrics, traces, and golden evaluation sets.

LangGraph ^0.4

Python >=3.11

observability

Reviewed 2026-05-16

Reading this chapter helps prevent 4 common multi-agent mistakes.

Overview

This chapter covers three things: CloudWatch GenAI metrics, traces, and golden evaluation sets used to check agent behavior before a change ships.

[Figure: Observability and Evaluation architecture]
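To make the metrics part concrete, here is a minimal sketch of publishing a custom agent metric as a CloudWatch embedded metric format (EMF) log line. The chapter does not say how AgentCore emits its built-in GenAI metrics, so treat this only as one possible route for your own measurements. The namespace, dimension, and metric names (Demo/AgentMetrics, Agent, LatencyMs, TokenCount) are assumptions made for illustration.

```python
import json
import time


def emf_record(namespace: str, agent: str, latency_ms: float, tokens: int) -> str:
    """Build one EMF log line carrying two metrics for a single agent call."""
    record = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # milliseconds since epoch
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    # One dimension set; the value comes from the "Agent" key below.
                    "Dimensions": [["Agent"]],
                    "Metrics": [
                        {"Name": "LatencyMs", "Unit": "Milliseconds"},
                        {"Name": "TokenCount", "Unit": "Count"},
                    ],
                }
            ],
        },
        # Top-level keys hold the dimension value and the metric values.
        "Agent": agent,
        "LatencyMs": latency_ms,
        "TokenCount": tokens,
    }
    return json.dumps(record)


# Hypothetical values; in practice they come from timing a real agent call.
line = emf_record("Demo/AgentMetrics", "planner", 412.5, 1830)
print(line)
```

The line only becomes a metric when it reaches CloudWatch Logs through an EMF-aware path (for example, standard output of a Lambda function). Printed locally, it is just JSON.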

Key ideas

LangGraph owns orchestration: explicit state, nodes, and conditional edges.
AgentCore owns production concerns: runtime hosting, memory, identity, gateway, observability.
MCP standardizes tool surfaces so workers do not hard-code every backend integration.
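The first idea (explicit state, nodes, and conditional edges) can be sketched without any framework. The code below is a plain-Python stand-in, not LangGraph's API: the State fields, node names, and the run loop are all assumptions chosen to show the shape of the pattern.

```python
from typing import Callable, TypedDict


class State(TypedDict):
    request: str
    plan: list[str]
    results: list[str]
    summary: str


def planner(state: State) -> State:
    # Split the request into one step per comma-separated clause.
    steps = [s.strip() for s in state["request"].split(",") if s.strip()]
    return {**state, "plan": steps}


def executor(state: State) -> State:
    # Run the next pending step; a real worker would call a tool here.
    step = state["plan"][len(state["results"])]
    return {**state, "results": state["results"] + [f"done: {step}"]}


def summarizer(state: State) -> State:
    return {**state, "summary": "; ".join(state["results"])}


def route_after_executor(state: State) -> str:
    # Conditional edge: loop until every planned step has a result.
    return "executor" if len(state["results"]) < len(state["plan"]) else "summarizer"


NODES: dict[str, Callable[[State], State]] = {
    "planner": planner,
    "executor": executor,
    "summarizer": summarizer,
}

# Each entry maps a node to the function that picks the next node.
EDGES: dict[str, Callable[[State], str]] = {
    "planner": lambda s: "executor" if s["plan"] else "summarizer",
    "executor": route_after_executor,
    "summarizer": lambda s: "END",
}


def run(request: str) -> State:
    state: State = {"request": request, "plan": [], "results": [], "summary": ""}
    node = "planner"
    while node != "END":
        state = NODES[node](state)
        node = EDGES[node](state)
    return state


final = run("fetch logs, check alarms")
print(final["summary"])  # done: fetch logs; done: check alarms
```

Keeping the routing decision in its own function is the point of the pattern: the loop condition is visible and testable on its own, rather than buried inside a node.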

Labs 4–9 in labs/agentcore/ follow a real AWS progression. Read the lab README for IAM and cost notes.

Walkthrough

Model the workflow as a graph: who plans, who executes tools, who summarizes.
Attach MCP tools through Gateway (or local mocks in early labs).
Add Memory for session continuity and Identity before calling protected APIs.
Instrument traces and run golden-set evals before promoting changes.
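The last step, tracing plus a golden-set gate, might look roughly like the sketch below. It uses only the standard library: the in-memory SPANS list stands in for a real trace exporter, triage_agent is a hypothetical agent under test, and the substring check and the 0.9 threshold are assumptions rather than anything the chapter prescribes.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass

SPANS: list[dict] = []  # in-memory stand-in for a trace exporter


@contextmanager
def span(name: str):
    """Record the wall-clock duration of one step as a span."""
    start = time.perf_counter()
    try:
        yield
    finally:
        SPANS.append({"name": name, "ms": (time.perf_counter() - start) * 1000})


@dataclass(frozen=True)
class GoldenCase:
    prompt: str
    must_contain: str  # substring the answer is expected to include


def triage_agent(prompt: str) -> str:
    """Hypothetical agent under test; replace with the real graph invocation."""
    with span("triage_agent"):
        return "severity=high" if "outage" in prompt.lower() else "severity=low"


def run_golden_set(agent, cases: list[GoldenCase]) -> float:
    """Return the fraction of golden cases the agent passes."""
    if not cases:
        raise ValueError("golden set is empty")
    passed = sum(case.must_contain in agent(case.prompt) for case in cases)
    return passed / len(cases)


GOLDEN = [
    GoldenCase("Checkout outage in us-east-1", "severity=high"),
    GoldenCase("Typo on the pricing page", "severity=low"),
    GoldenCase("Intermittent login failures", "severity=high"),
]

PASS_THRESHOLD = 0.9  # assumed promotion gate, not from the chapter

score = run_golden_set(triage_agent, GOLDEN)
print(f"pass rate: {score:.2f}, spans recorded: {len(SPANS)}")
print("promote" if score >= PASS_THRESHOLD else "block promotion")
```

Here the third case fails because the toy agent only recognizes the word "outage", so the pass rate is 2 of 3 and the change is blocked. That is the behavior a golden set is meant to surface before promotion.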

Quiz

What is the primary focus of Observability and Evaluation?
Which pattern routes work between specialized agents?
Why expose tools through AgentCore Gateway?
