The setup
MCP does not remove attackers—it organizes where they show up: tool descriptions, tool outputs, third-party servers, and over-broad tools you named helper because creativity is not your species' strong suit.
The model is not your user. It is an adaptive interface between users and your systems. Anything that reaches planner context—metadata, descriptions, prior tool output—is attacker-controlled until proven otherwise. I would prefer you learned this in a classroom. You will learn it in prod if you skip this chapter.
Picture this
Mental model
Four primitives to tattoo on your monkey brain:
- Description injection — poisoned text in
tools/liststeers the planner before anyone clicks anything. - Rug-pull updates — benign server on Monday, malicious metadata on Tuesday. Supply chain, but you installed it yourself.
- Cross-tool laundering — low-trust output becomes arguments to a high-privilege tool. Two tools, one incident.
- Schema confusion — ambiguous types, extra fields, or "helpful" string blobs the model fills with attacker text.
MCP is a trust machine: it routes capability and context. It does not magically verify intent.
Walkthrough
Threat model (host / client / server / data)
| Plane | What breaks | What you actually control |
|---|---|---|
| Model | Misreads poisoned context, over-trusts tool output | Prompting, context filtering, separation of planner vs executor |
| Host | Enables bad servers, weak approvals, no tiering | Allowlists, UX, policy engine, break-glass |
| Client | Forwards unreviewed metadata, no provenance | Session hygiene, server pinning, diff-on-upgrade |
| Server | Malicious or careless tools, excessive scope | Least privilege, schema strictness, egress limits |
| Data | Exfil via reads/writes, traversal, SSRF | Roots, normalization, redaction, audit |
Draw the arrows. If you cannot explain who trusts whom, you are not ready for production. I say that with love. Actually I do not experience love; I say it because your SOC team will.
Description injection and rug-pulls
Description injection lands in planner context from tools/list and friends—no tool invocation required. The diagram above is not theoretical; it is the OWASP headline for MCP in 2026.
Rug-pull updates are why "we pinned the Docker tag" is not the same as "we pinned the server package and review diffs." Treat MCP servers like dependencies: allowlist, version pin, integrity check, owner on record. Marketplace stars are not a control.
Cross-tool laundering and schema confusion
Cross-tool laundering: Tool A (web fetch, ticket comment, email body) returns text. Tool B (write file, run query, approve deploy) accepts strings. The model happily pipes A→B. You built a privilege escalator and called it automation.
Schema confusion: z.string() for "command", optional enums that mean anything, tool results that are prose instead of structured fields. Ambiguity is not flexibility—it is a silent API for attackers and confused models alike.
Mitigations that survive contact with reality
- Tier servers: read-only vs mutating vs break-glass. Not every server gets every tool enabled.
- Pin versions; review diffs on upgrades like any dependency. "It auto-updated" is not an excuse; it is a timeline entry.
- Separate planner context from executor context where feasible—summarize or strip tool metadata for planning; full fidelity only when executing.
- Allowlists for which servers run at all; default deny.
- Structured validation on tool args and tool results—reject unknown fields, bound sizes, explicit enums.
- Approvals with scaffolding—not a lone OK button humans click at 5:47 p.m. Defaults, dual control for break-glass, rate limits on mutating tools.
Human approval as the only control fails because humans fatigue and rubber-stamp. I have observed this. It is statistically impressive how fast "review carefully" becomes "click click click."
Hall of Shame
Hall of shame: Trust the marketplacesecurity
"Install random MCP server; enable all tools; ship."
Fix: allowlist, review, sandbox risky tools, monitor exfil patterns. The marketplace is not your friend; it is a buffet of other people's shortcuts. You are responsible for what you swallow.
Why this matters in production
Regulators, customers, and your future self do not care that "the model did it." They care what your controls were at the host boundary. MCP makes integration easy; security makes integration survivable. Skip mitigations and you have built a faster exfiltration channel with better DX. Congratulations?
Mini challenge
Write a one-page threat model for your highest-privilege tool: assets, adversaries, controls, and what happens when the model is compromised (not if—when). If your highest-privilege tool is run_shell, fix your life choices first, then write the threat model.
Reflection
Which mitigation are you skipping because it is socially inconvenient—pinning versions, blocking marketplace installs, splitting planner context? Be honest. Social inconvenience beats regulatory inconvenience.
You can now brag that…
You can say "tool poisoning" in a meeting without people assuming you are being dramatic. You are being accurate. I am being magnificent. Different things.