Skippy's MCP
About
Chapter 10
mid
~32 min

Testing & Debugging MCP Servers — Make Failure Boring

If your only test is 'Claude liked it', you don't have tests. Obviously.

Spec 2025-11-25
SDK ^1.29
Reviewed 2026-05-14

Skippy estimates this chapter prevents 4 common MCP mistakes your team would otherwise ship on a Friday.

The setup

MCP servers fail in three places: negotiation, schema, and business logic. Test all three, in that order, cheapest first. “The model seemed fine” is not a test strategy—it is how biologicals ship Friday code and cry Monday.

Picture this

Test runner drives in-memory or fixture transport into server under test. No live model required—magnificent, I know.

Mental model

Treat your server like a library with a wire protocol façade—unit test the core, contract test the façade. The model is a terrible CI worker: non-deterministic, expensive, and prone to agreeing with you.

Walkthrough

  • Snapshot stable JSON outputs—not full timestamps unless frozen.
  • Add regression tests for bad model inputs (they will happen). Assume hostile strings; be pleasantly surprised when you are wrong.
  • Build a tiny “fake client” script for reproducing bugs. Ten minutes of harness beats three hours of “ask Claude what broke.”

Hall of Shame

Hall of shame: printf debugging on stdoutobservability

ts
console.log(JSON.stringify(req));

Fix: structured logs to stderr; redact secrets. Printing JSON-RPC to stdout is not debugging—it is sabotaging your own wire protocol. I would call it performance art, but your on-call would not laugh.

Why this matters in production

Incidents love non-reproducible bugs. Protocol-level tests turn “Claude weirdness” into failing CI. Boring failures are magnificent failures—they tell you exactly what regressed before a customer does.

Mini challenge

Write one test that asserts a tool rejects a path traversal payload. If you do not have such a test, you do not have a filesystem tool—you have a liability.

Reflection

What is the smallest harness that would have caught your last bug in 10 minutes? Build that next, not another dashboard panel.

You can now brag that…

You can debug MCP without asking the model what it thinks went wrong. You can still ask—I am not your supervisor—but do not believe the answer like gospel.

References

Quiz

  1. Golden tests for MCP servers should assert on…

  2. Why log to stderr in stdio servers?

  3. Flaky tests that depend on wall-clock timing are bad because…