Real-Time AI Agent Tool Execution Monitoring: The MCP Runtime Visibility Guide

Obsidian Editorial Team

Security Research

Obsidian Security

May 20, 2026

June 1, 2026

Key Takeaways

Ninety percent of AI agents running in enterprise environments today hold more access than their workflows require.
Security teams know those agents exist.
What they cannot answer is what those agents are actually doing right now: which MCP servers they are calling, whose credentials they are running under, and whether any of that activity aligns with policy.
That is a visibility gap, and configuration-based tools cannot close it.
Real-time AI agent tool execution monitoring is the capability that closes it, by capturing agent activity as it happens rather than by reviewing logs after the fact.

Why Configuration Visibility Leaves Security Teams Ghost Chasing

Security teams managing AI agent deployments face a specific frustration. Native platform dashboards show agent configurations. Audit logs exist. But when an incident occurs, or when a CISO asks what a specific agent accessed last Tuesday, the answer requires manual correlation across siloed logs from multiple platforms, identity providers, and SaaS applications. Through a security assessment, one enterprise discovered 377 Copilot agents that its team had no prior record of. Another found 2,500 agents already created before any inventory process existed.

This is ghost chasing: spending security resources on theoretical configuration signals with no runtime evidence of what actually happened. The agent looks fine on paper. Its configuration shows expected connectors and scoped permissions. But the configuration does not tell you whether a user without Salesforce access invoked that agent and extracted CRM records using the creator's embedded credentials. Only runtime observation captures that sequence.

The core problem is structural. Posture-based tools read configuration state. They answer the question "how is this agent set up?" Runtime monitoring answers the question "what did this agent do, on whose behalf, using which credentials, touching which data?" Those are different questions. For AI agent monitoring, only the second question produces evidence that is operationally useful.

Probabilistic agents vs. deterministic guardrails: this contrast sits at the center of the problem. Agents do not follow a fixed script. They respond to context, user input, and chained tool calls no configuration document anticipated. Governing them requires observing what they actually do, not reviewing what their setup suggests they should do.

What MCP Runtime Monitoring Actually Captures

The Model Context Protocol (MCP) is an open standard that connects AI agents to external tools, data sources, and services. An agent using MCP can call a file retrieval tool, query a database, send an API request, or trigger a downstream workflow, all within a single user interaction. Each of those calls is a discrete event with an identity, a target, a payload, and an outcome.

MCP runtime monitoring captures those events as they occur. The mechanism works through direct integration with AI agent platforms via native APIs and webhooks. No network proxy. No inline traffic interceptor. The integration sits at the platform layer, capturing the full event stream: which agent was invoked, by which identity, at what time, which MCP tool calls were made, what data was accessed, and what the agent returned.
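As a rough illustration of what one captured event might look like, the sketch below models a single MCP tool call as a record. The `ToolCallEvent` class, its field names, and the sample values are all hypothetical; they mirror the fields listed above and are not any platform's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class ToolCallEvent:
    """One MCP tool call as a discrete runtime event (hypothetical schema)."""
    agent_id: str            # which agent was invoked
    invoker: str             # identity that triggered the agent
    credential_owner: str    # identity whose credentials the agent ran under
    mcp_server: str          # MCP server that received the call
    tool: str                # tool name on that server
    payload: dict[str, Any]  # arguments sent with the call
    outcome: str             # e.g. "success" or "error"
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def summarize(event: ToolCallEvent) -> str:
    """Render the who / what / where of an event as one audit line."""
    return (
        f"{event.invoker} ran {event.agent_id} as {event.credential_owner}: "
        f"{event.mcp_server}/{event.tool} -> {event.outcome}"
    )


# Illustrative values only.
event = ToolCallEvent(
    agent_id="crm-helper",
    invoker="alice@example.com",
    credential_owner="bob@example.com",
    mcp_server="crm-mcp",
    tool="query_accounts",
    payload={"region": "EMEA"},
    outcome="success",
)
print(summarize(event))
```

Keeping the invoker and the credential owner as separate fields is the design choice that matters here: the identity correlation discussed later depends on both being recorded for every call.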

This matters for several reasons that static posture cannot address:

| What Runtime Monitoring Captures | Why Configuration Tools Miss It |
| --- | --- |
| Live agent invocations with caller identity | Config shows the agent exists, not who ran it |
| MCP tool call sequences and targets | Tool calls happen at runtime, not in config files |
| Data access events with sensitivity context | Access logs are siloed per platform |
| Maker mode credential usage per invocation | Credential inheritance is invisible without runtime correlation |
| Orphaned agent activity (disabled owner, active agent) | Config shows the agent; runtime shows it is still running |
| Shadow MCP server connections | Unsanctioned servers only appear when called |

One point deserves emphasis. MCP interactions are only observable at runtime. The tools inside an MCP server, the sequences in which they are called, and the data they return do not exist in any configuration file. Retroactive log review can tell you that an agent ran. It cannot reconstruct which tool calls it made, in what order, or what data moved as a result. That reconstruction requires continuous capture from the moment the agent is invoked.

The Cross-Platform Blind Spot: Why Single-Pane Visibility Matters

Enterprises do not run one AI platform. They run several simultaneously. A workflow might originate in Microsoft Copilot Studio, invoke an MCP server that queries an n8n workflow, which in turn writes to a Salesforce record. The Copilot team sees the Copilot leg. The n8n administrator sees the workflow leg. Nobody sees the full action chain unless a single monitoring layer spans all three.

This is the cross-platform blind spot that makes agentic AI security genuinely difficult. Each platform produces its own logs in its own format with its own identity model. Stitching those logs together manually takes days, not minutes, and produces incomplete pictures because the correlation logic between platforms does not exist natively.

A single pane of glass for MCP runtime monitoring means one continuous view across supported platforms: Salesforce Agentforce, Amazon Bedrock, Microsoft Copilot Studio, Azure AI Foundry, n8n, ChatGPT Enterprise, and others. Every agent invocation, every MCP tool call, every data access event from every platform feeds into one correlated stream. Security teams stop asking which dashboard to check and start asking what actually happened.
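To make the idea of one correlated stream concrete, here is a minimal sketch that normalizes records from three platforms into a common shape and groups them into action chains. Everything in it is an assumption for illustration: the raw field names, the `FIELD_MAP` table, and above all the shared `chain` identifier, which platforms do not emit natively and which a monitoring layer would have to derive itself.

```python
from collections import defaultdict

# Hypothetical raw records: each platform uses its own field names.
RAW = [
    {"platform": "copilot_studio", "user": "alice", "ts": 1,
     "chain": "c-1", "action": "invoke agent"},
    {"platform": "n8n", "triggered_by": "svc-mcp", "time": 2,
     "chain": "c-1", "step": "run workflow"},
    {"platform": "salesforce", "actor": "svc-n8n", "at": 3,
     "chain": "c-1", "op": "update record"},
    {"platform": "n8n", "triggered_by": "carol", "time": 5,
     "chain": "c-2", "step": "run workflow"},
]

# Assumed mapping from each platform's native fields to a common schema.
FIELD_MAP = {
    "copilot_studio": {"identity": "user", "time": "ts", "action": "action"},
    "n8n": {"identity": "triggered_by", "time": "time", "action": "step"},
    "salesforce": {"identity": "actor", "time": "at", "action": "op"},
}


def normalize(record):
    """Translate one platform-specific record into the common schema."""
    mapping = FIELD_MAP[record["platform"]]
    return {
        "platform": record["platform"],
        "chain": record["chain"],
        "identity": record[mapping["identity"]],
        "time": record[mapping["time"]],
        "action": record[mapping["action"]],
    }


def build_chains(records):
    """Group normalized events by chain ID and order each chain by time."""
    chains = defaultdict(list)
    for rec in records:
        event = normalize(rec)
        chains[event["chain"]].append(event)
    for events in chains.values():
        events.sort(key=lambda e: e["time"])
    return dict(chains)


chains = build_chains(RAW)
for step in chains["c-1"]:
    print(step["time"], step["platform"], step["identity"], step["action"])
```

In this toy data the chain `c-1` spans all three platforms, which is the view no single platform's log provides on its own.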

The shadow agent problem compounds this. An agent can connect to an MCP server that was never registered with the security team. That server might expose tools with broad data access. The connection only becomes visible when the agent calls it at runtime. No configuration review surfaces it before that moment. Runtime monitoring is the only detection layer that catches shadow MCP servers as they are used, not after a breach surfaces them.
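The detection logic described here reduces to comparing the servers seen in the runtime stream against a registry of sanctioned ones. The sketch below assumes a hypothetical registry (`SANCTIONED_SERVERS`) and hypothetical event fields (`agent_id`, `mcp_server`); real server identifiers and event shapes will differ.

```python
# Hypothetical registry of MCP servers approved by the security team.
SANCTIONED_SERVERS = {"crm-mcp.internal", "files-mcp.internal"}


def find_shadow_servers(events, sanctioned):
    """Map each unregistered server seen at runtime to the agents that called it."""
    shadow = {}
    for event in events:
        server = event["mcp_server"]
        if server not in sanctioned:
            shadow.setdefault(server, set()).add(event["agent_id"])
    return {server: sorted(agents) for server, agents in shadow.items()}


# Illustrative runtime events.
observed = [
    {"agent_id": "crm-helper", "mcp_server": "crm-mcp.internal"},
    {"agent_id": "report-bot", "mcp_server": "scratch-mcp.dev"},
    {"agent_id": "crm-helper", "mcp_server": "scratch-mcp.dev"},
]

print(find_shadow_servers(observed, SANCTIONED_SERVERS))
```

The unregistered server only enters the result because an agent actually called it, which matches the point above: there is nothing to compare against the registry until the call happens.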

For teams securing n8n workflows or Salesforce Agentforce deployments, this cross-platform visibility is not a nice-to-have. It is the prerequisite for any meaningful security program.

Machine Insider Risk: The Identity Layer Runtime Monitoring Reveals

AI agents are non-human identities. They hold tokens, service accounts, and OAuth grants. They authenticate to SaaS applications and take actions on behalf of users. They are, in every functional sense, machine insiders. And they are invisible to most existing insider risk programs.

The machine insider risk problem has a specific technical mechanism that runtime monitoring makes visible. When an agent is built in maker mode, it runs using the creator's credentials for every invocation, regardless of who the invoker is. A user without Salesforce access can invoke that agent and receive Salesforce data at the creator's privilege level. The user bypassed every IAM control. The agent did exactly what it was configured to do. Nothing in the configuration flags this as a problem.

Runtime monitoring surfaces this by correlating three data points at the moment of invocation: the runner's identity (who triggered the agent), the agent's embedded credentials (whose authority it is using), and the data accessed (what actually moved). That correlation produces the evidence that a privilege escalation occurred, not a theoretical risk that one could occur.

Two escalation scenarios are worth distinguishing:

Scenario 1: Cross-identity privilege escalation. A user without platform access invokes an agent that has embedded credentials providing that access. The user extracts data they were never authorized to see.
- What the control surfaces: invoker identity correlated against agent credential authority
- Why theoretical configuration fails: the configuration is technically functioning as designed; no policy is flagged as violated

Scenario 2: Within-platform permission escalation. A user with standard platform access invokes an agent built by an administrator. The agent runs under admin credentials. The user accesses records or performs actions beyond their own permission level.
- What the control surfaces: the delta between invoker entitlements and agent effective authority at runtime
- Why theoretical configuration fails: the creator's credentials appear in scope; the invoker's lower privilege level is invisible to posture tools
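The two scenarios above can be distinguished with a set difference between what the invoker is entitled to and what the agent's embedded credentials allow. This is a minimal sketch under stated assumptions: entitlements are modeled as plain sets of permission names per platform, and the function name, labels, and permission strings are all hypothetical.

```python
def classify_invocation(invoker_entitlements, agent_authority, platform):
    """
    Compare invoker entitlements with agent authority on one platform.

    Both arguments map platform name -> set of permission names.
    Returns (label, excess), where excess is what the agent can do
    that the invoker cannot.
    """
    invoker = invoker_entitlements.get(platform, set())
    agent = agent_authority.get(platform, set())
    excess = agent - invoker
    if not excess:
        return ("none", set())
    if not invoker:
        # Scenario 1: the invoker has no access to the platform at all.
        return ("cross_identity", excess)
    # Scenario 2: the invoker has some access, the agent has more.
    return ("within_platform", excess)


# Hypothetical agent built by an administrator in maker mode.
agent = {"salesforce": {"read_accounts", "delete_accounts"}}

print(classify_invocation({}, agent, "salesforce"))
print(classify_invocation({"salesforce": {"read_accounts"}}, agent, "salesforce"))
print(classify_invocation(agent, agent, "salesforce"))
```

The excess set is the evidence that matters at runtime: it names the permissions exercised through the agent that the invoker does not hold.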

The orphaned agent problem follows the same logic. An agent whose creator account has been disabled continues running with the inherited credentials of that disabled account. Configuration shows the agent exists. Runtime monitoring shows it is still being invoked, still accessing data, still operating as a machine insider with credentials that should no longer be active.
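Orphaned agent activity can be surfaced by joining three inputs: who owns each agent, which accounts are disabled, and which agents appear in the runtime stream. The sketch below assumes those inputs are available as simple Python structures; the function name and all identifiers are hypothetical.

```python
from collections import Counter


def orphaned_agent_activity(agent_owners, disabled_accounts, events):
    """Count runtime invocations of agents whose owner account is disabled."""
    counts = Counter(
        event["agent_id"]
        for event in events
        if agent_owners.get(event["agent_id"]) in disabled_accounts
    )
    return dict(counts)


# Illustrative inputs.
owners = {"crm-helper": "bob", "report-bot": "dana"}
disabled = {"bob"}
stream = [
    {"agent_id": "crm-helper"},
    {"agent_id": "report-bot"},
    {"agent_id": "crm-helper"},
]

print(orphaned_agent_activity(owners, disabled, stream))
```

Configuration data supplies the first two inputs; only the runtime stream supplies the third, which is what turns "this agent has a disabled owner" into "this agent is still being used".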

For teams working through AI agent risk assessment, this identity correlation layer is what separates a security program from a compliance checkbox.

From Monitoring to Effective Authority: Closing the Visibility Gap

Most tools see theoretical configuration: what the agent is set up to do on paper. Effective authority is what the agent can actually do inside each SaaS application after all entitlements resolve. The gap between those two states is where the real blast radius lives.

Effective authority mapping requires correlating agent configuration with actual entitlements from connected SaaS applications, identity context from the identity provider, MCP server inventory (sanctioned versus unsanctioned), and the runtime event stream. No single data source produces this picture. A continuously updated correlation layer builds it by ingesting all four streams and resolving them into a living map of what every agent can actually execute, on whose behalf, with what downstream reach.

This is the operational intelligence layer that transforms raw monitoring data into security decisions. A risk score that says "this agent has elevated permissions" is useful. A risk score that says "this agent is running under a disabled account's credentials, is accessible org-wide, and has called an unregistered MCP server three times this week" is actionable. Toxic combinations, where multiple risk factors stack on a single agent, are only visible when the runtime stream feeds into a correlated authority model.
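A toxic combination can be expressed as several independent risk factors evaluated against the same agent. The sketch below uses the three factors from the example above; the `AgentState` fields and the threshold of two stacked factors are assumptions chosen for illustration, not a scoring model taken from any product.

```python
from dataclasses import dataclass


@dataclass
class AgentState:
    """Correlated view of one agent (hypothetical fields)."""
    agent_id: str
    owner_disabled: bool
    org_wide_access: bool
    unregistered_server_calls: int  # calls observed this week


def risk_factors(state):
    """List every risk factor that currently applies to the agent."""
    factors = []
    if state.owner_disabled:
        factors.append("runs under a disabled account's credentials")
    if state.org_wide_access:
        factors.append("is accessible org-wide")
    if state.unregistered_server_calls > 0:
        factors.append(
            f"called an unregistered MCP server "
            f"{state.unregistered_server_calls} time(s) this week"
        )
    return factors


def is_toxic(state, threshold=2):
    """Flag agents where at least `threshold` risk factors stack."""
    return len(risk_factors(state)) >= threshold


agent = AgentState("crm-helper", owner_disabled=True,
                   org_wide_access=True, unregistered_server_calls=3)
print(is_toxic(agent), risk_factors(agent))
```

The first two factors come from configuration and identity data, while the third exists only in the runtime stream, so the full combination cannot be computed without it.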

The path toward deterministic guardrails starts here. Probabilistic agents, by design, can deviate from intended behavior. They cannot be governed by probabilistic controls. Deterministic guardrails, fixed rules that enforce least privilege and block unauthorized action patterns, require a runtime monitoring foundation to operate against. You cannot enforce rules on activity you cannot see. Runtime monitoring is the prerequisite layer. Enforcement is the target state that runtime visibility makes possible.
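As a sketch of what "fixed rules" could mean in practice, the example below evaluates a tool call against an allowlist and an invoker-access check. The `Call` type, the allowlist entries, and the decision strings are hypothetical; the point is only that the same input always produces the same decision.

```python
from typing import NamedTuple


class Call(NamedTuple):
    """A tool call awaiting a decision (hypothetical shape)."""
    agent_id: str
    mcp_server: str
    tool: str
    invoker_has_access: bool  # does the invoker hold access to the target system?


# Hypothetical allowlist of (agent, server, tool) combinations.
ALLOWED_TOOLS = {
    ("crm-helper", "crm-mcp", "query_accounts"),
    ("crm-helper", "crm-mcp", "get_contact"),
}


def evaluate(call):
    """Apply fixed rules in order; anything not explicitly allowed is blocked."""
    if (call.agent_id, call.mcp_server, call.tool) not in ALLOWED_TOOLS:
        return "block: tool not on allowlist"
    if not call.invoker_has_access:
        return "block: invoker lacks access to target system"
    return "allow"


print(evaluate(Call("crm-helper", "crm-mcp", "query_accounts", True)))
print(evaluate(Call("crm-helper", "crm-mcp", "delete_accounts", True)))
print(evaluate(Call("crm-helper", "crm-mcp", "query_accounts", False)))
```

Every input to `evaluate` is a runtime observation (which agent, which server, which tool, which invoker), which is why enforcement of this kind depends on the monitoring layer being in place first.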

For teams planning their rollout, the sequence matters: visibility first, authority mapping second, enforcement third.

Building the Runtime Monitoring Practice: What Security Teams Need

Security teams evaluating MCP runtime monitoring capabilities should assess against five operational requirements:

1. Continuous capture, not periodic polling. MCP tool call sequences happen in seconds. Polling-based monitoring misses intermediate steps. The capture mechanism must be event-driven and continuous.
- What this enforces: complete event sequences including all intermediate tool calls within a session
- Why periodic polling fails: the window between polls is long enough for entire action chains to complete and leave no trace in an aggregated snapshot

2. Cross-platform correlation. Monitoring a single platform produces a partial picture. The value compounds when agent activity from Copilot Studio, Bedrock, and n8n feeds into one correlated stream.
- What this enforces: a single evidence record that spans multi-platform action chains
- Why single-platform monitoring fails: the same agent interaction can span three platforms; each platform's native log tells a fragment of the story

3. Identity resolution at the invocation level. Knowing that an agent ran is not enough. The monitoring layer must resolve who ran it, under which credentials, and whether the invoker's identity aligns with the agent's authority level.
- What this enforces: the correlation between invoker identity and agent effective authority that surfaces privilege escalation
- Why raw invocation logs fail: they record that the agent ran without attributing the action to the specific human identity that triggered it

4. MCP server inventory integration. Runtime monitoring should feed a live inventory of every MCP server contacted, flagging unsanctioned servers as they appear. Shadow MCP server discovery is a runtime function, not a configuration function.
- What this enforces: a current, complete list of every server the agent estate communicates with
- Why pre-deployment review fails: shadow MCP servers are connected after deployment, often by developers who bypass the registration process entirely

5. Connector-free deployment. Security teams should not need to become SaaS administrators to deploy monitoring. The capability should be deployable by the security team, independent of platform ownership.
- What this enforces: security team autonomy over the monitoring layer
- Why connector-dependent approaches fail: they create a dependency on SaaS platform administrators and delay deployment until IT sign-off clears
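Requirement 1 can be illustrated with a toy simulation that contrasts event-driven capture with a poller that only sees the most recent state at each poll. The timeline, the tool names, and the simplified polling model are all assumptions made for the sake of the example.

```python
# Hypothetical session: (seconds since invocation, tool called).
TIMELINE = [
    (0.5, "search_files"),
    (1.2, "read_file"),
    (2.0, "query_database"),
    (2.8, "send_api_request"),
]


def event_driven_capture(timeline):
    """Record every tool call as it is emitted."""
    return [tool for _, tool in timeline]


def polled_capture(timeline, interval, duration):
    """At each poll, record only the most recent tool call, if it changed."""
    seen = []
    t = interval
    while t <= duration:
        so_far = [tool for ts, tool in timeline if ts <= t]
        if so_far and (not seen or seen[-1] != so_far[-1]):
            seen.append(so_far[-1])
        t += interval
    return seen


print(event_driven_capture(TIMELINE))
print(polled_capture(TIMELINE, interval=5, duration=10))
```

With a five-second poll, the whole chain completes between polls, so the poller records only the final call while the event-driven capture keeps all four steps in order.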

Teams currently relying on native platform logs face a specific scaling problem. Those logs are siloed, require manual correlation, and do not capture the cross-application identity relationships that make privilege escalation visible. The AI agent security risks that surface most frequently in enterprise environments (maker mode privilege escalation, orphaned agent activity, shadow MCP server connections, and agent-to-agent data exposure) are all runtime phenomena. They do not exist in configuration files. They exist in the live event stream that runtime monitoring captures.

Frequently Asked Questions

What is MCP runtime monitoring?

MCP runtime monitoring is the continuous capture of AI agent activity as it happens: agent invocations, MCP tool calls, data access events, and identity correlations. It produces runtime truth rather than theoretical configuration, showing what agents actually do rather than what their setup says they should do.

Why can't retroactive log review replace runtime monitoring?

MCP tool call sequences occur in seconds and involve multiple intermediate steps. Retroactive log review captures that an agent ran but cannot reconstruct the specific tool calls made, the order they occurred, or the data that moved during the session. Continuous runtime capture is the only mechanism that preserves that sequence.

What is the difference between MCP tool call monitoring and MCP runtime monitoring?

MCP tool call monitoring is a component of broader runtime monitoring, focused specifically on individual tool call events. MCP runtime monitoring is the encompassing capability: the continuous stream that captures agent invocations, identity correlation, data access, and tool calls together as a correlated event sequence across platforms.

How does runtime monitoring surface maker mode privilege escalation?

By correlating three data points at the moment of invocation: the invoker's identity, the agent's embedded credentials, and the data accessed. When the invoker lacks direct platform access but the agent's credentials provide it, runtime monitoring flags that gap as a privilege escalation event. Configuration tools cannot produce this correlation.

What are shadow MCP servers and why does runtime monitoring matter for finding them?

Shadow MCP servers are MCP server connections that were never registered with or approved by the security team. They only become visible when an agent calls them at runtime. No configuration review surfaces a shadow MCP server before it is used. Runtime monitoring detects the connection at the moment it occurs.

Does MCP runtime monitoring require owning or administering the SaaS platforms being monitored?

No. Connector-free runtime monitoring deploys at the security team level, independent of SaaS platform ownership. Security teams do not need IT or SaaS administrator sign-off to gain visibility into agent activity across supported platforms.