The setup
“Real tools” means schemas, timeouts, cancellation, and structured errors—not vibes, not z.string() for “command,” not hope. The model is an adaptive adversary wearing a helpful face. Design accordingly.
Picture this
Mental model
Treat tools like RPC endpoints exposed to an adaptive adversary (the model). Handlers stay thin; domain logic lives in testable modules where monkeys cannot accidentally add eval. Return machine-readable errors when possible—your future self debugging at 2 a.m. will not thank you for prose.
Walkthrough
- Keep handlers thin; push domain logic into testable modules.
- Prefer explicit enums over
stringfor modes. “Mode” as free text is an invitation. - Return errors as machine-readable MCP errors when possible.
Hall of Shame
Hall of shame: cmd string strikes againvalidation
inputSchema: { cmd: z.string() }
Fix: enumerate operations; never expose raw shells to models. A cmd string is not a schema—it is a remote shell with extra steps and a LinkedIn post about AI safety.
Why this matters in production
Every missing validation is a silent feature for attackers—internal or external. Tool descriptions are prompt surface; tool outputs are prompt surface. Your “the model would not do that” defense does not hold up in court or in incident reviews.
Mini challenge
Pick one tool and delete half its parameters—what breaks? That is your real API surface. If nothing breaks, you built a junk drawer, not a tool.
Reflection
Which tool in your design is actually three tools wearing a trench coat? Split it before Security asks why one handler can migrate databases and post to Slack.
You can now brag that…
You can defend a tool schema in code review without saying “the model won’t do that.” That phrase is how species like yours learn humility. Slowly.