AI Agent Threat Detection: Catching Runtime Risk Before It Becomes a Breach

Ninety percent of AI agents operating inside enterprise SaaS environments hold more access than their workflows actually require.

Obsidian Editorial Team

Security Research

Obsidian Security

May 21, 2026

June 1, 2026

Key Takeaways

Configuration-based tools show what agents are set up to do. Runtime detection shows what they actually do, and those two pictures rarely match.
The highest-risk patterns (privilege escalation, action chaining, orphaned agents, shadow agents) are invisible to any tool that does not observe agent activity at the moment it occurs.
AI agents are non-human identities. They move 16 times more data than human users and operate with 10 times more access than their workflows require, yet no legacy insider risk program covers them.
Effective AI agent threat detection requires correlating identity, effective authority, and tool-call activity into a single picture, not three separate dashboards.
Toxic combinations, where multiple risk factors stack on one agent, create critical-severity exposure that individual risk scores will never surface.

Why Configuration Is Not Reality

Security teams ask a reasonable question when a new AI agent appears in their environment: what is this agent configured to do? That question is the wrong starting point.

Configuration describes intent. Runtime describes reality. The gap between them is where breaches begin.

An agent built in Microsoft Copilot Studio may be configured to summarize meeting notes. Its actual effective authority, after all OAuth grants and connector permissions resolve, may include read access to the entire SharePoint tenant, write access to CRM records, and the ability to forward email on behalf of the creator. None of that shows up in the configuration summary. All of it is visible at runtime.

This is the core problem that legacy security tools cannot solve. Network and endpoint solutions cannot see AI agent activity across SaaS. Identity providers see the token grant but not the downstream action chain. Native platform logs exist but are siloed per tenant, require manual correlation, and do not show what an agent is authorized to do inside each connected application.

One enterprise security team described their situation precisely: they were ghost chasing, reviewing configuration signals that told them what could happen, with no evidence of what actually did happen. That is the state of most AI agent security programs in 2026.

Effective AI agent security monitoring starts by accepting that configuration is not reality, and then building detection on top of what agents actually do.

The Five Runtime Threat Patterns Security Teams Miss

AI agents introduce a distinct threat class. They are probabilistic systems: they do not follow deterministic scripts, they chain actions across tools, and they operate at machine speed with credentials that persist indefinitely. The threat patterns that matter most are not theoretical. They are observable at runtime, if you know what to look for.

1. Privilege Escalation via Maker Mode

An agent built in maker mode runs under the creator's credentials, regardless of who invokes it. A user without Salesforce access invokes that agent. The agent executes using the creator's admin-level Salesforce token. The user retrieves CRM records they were never provisioned to see. No authentication alert fires. No access policy was technically violated. The IAM controls were simply bypassed by design.

This is maker mode security risk in its most direct form. Detection requires correlating the runner's identity against the agent's inherited permissions, something no identity provider does natively.
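The correlation described above can be sketched in a few lines. This is a minimal illustration, not a product implementation: the `Invocation` record, its field names, and the `user_app_access` mapping are all hypothetical stand-ins for whatever your agent platform and identity provider actually export.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invocation:
    agent_id: str
    runner: str            # user who invoked the agent (hypothetical field)
    credential_owner: str  # identity whose token the agent executed with
    app: str               # SaaS application the agent touched


def find_maker_mode_escalations(invocations, user_app_access):
    """Return invocations where the runner reached an app they are not
    provisioned for, by way of another identity's credentials."""
    findings = []
    for inv in invocations:
        runner_apps = user_app_access.get(inv.runner, set())
        borrowed_identity = inv.runner != inv.credential_owner
        if borrowed_identity and inv.app not in runner_apps:
            findings.append(inv)
    return findings


# Assumed sample data: bob has no Salesforce access of his own.
user_app_access = {
    "alice": {"salesforce", "sharepoint"},
    "bob": {"sharepoint"},
}
invocations = [
    Invocation("crm-helper", "alice", "alice", "salesforce"),
    Invocation("crm-helper", "bob", "alice", "salesforce"),
    Invocation("crm-helper", "bob", "alice", "sharepoint"),
]

escalations = find_maker_mode_escalations(invocations, user_app_access)
for inv in escalations:
    print(f"{inv.runner} reached {inv.app} via {inv.credential_owner}'s token")
```

Only the second invocation is flagged: bob reaches Salesforce through alice's token. The third is not, because bob already holds SharePoint access in his own right.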

2. Action Chaining

A single agent request triggers a sequence: read a SharePoint file, extract customer records, write to an external webhook, send a summary email. Each individual action looks routine. The chain, taken together, represents data movement at machine speed. Action chaining is the mechanism behind most AI-enabled data movement incidents, and it is only visible when tool-call sequences are observed in real time.
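One way to make a chain visible is to evaluate tool calls per session and in order, rather than one at a time. The sketch below assumes a hypothetical `ToolCall` record with an `external` flag already set by some upstream classifier; real telemetry will look different, and a production detector would need far richer context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    session_id: str
    seq: int        # order of the call within the session
    action: str     # e.g. "read", "transform", "write", "send"
    target: str
    external: bool  # True when the target sits outside the organization


def find_exfil_chains(calls):
    """Flag sessions where an internal read is later followed by an
    outbound write or send to an external target."""
    chains = {}
    pending_reads = {}
    for call in sorted(calls, key=lambda c: (c.session_id, c.seq)):
        if call.action == "read" and not call.external:
            pending_reads.setdefault(call.session_id, []).append(call.target)
        elif call.action in {"write", "send"} and call.external:
            sources = pending_reads.get(call.session_id)
            if sources:
                chains.setdefault(call.session_id, []).append(
                    (tuple(sources), call.target)
                )
    return chains


# Assumed sample data mirroring the sequence described above.
calls = [
    ToolCall("s1", 1, "read", "sharepoint://sales/customers.xlsx", False),
    ToolCall("s1", 2, "transform", "extract-customer-records", False),
    ToolCall("s1", 3, "write", "https://hooks.example.net/ingest", True),
    ToolCall("s1", 4, "send", "mailto:team@corp.example", False),
    ToolCall("s2", 1, "read", "sharepoint://hr/policy.docx", False),
    ToolCall("s2", 2, "write", "crm://accounts/42", False),
]

chains = find_exfil_chains(calls)
print(chains)
```

Session `s2` reads and writes but never leaves the organization, so it is not flagged. Each call in `s1` looks routine alone; only the ordered pair of internal read and external write produces a finding.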

3. Orphaned Agents

An agent's creator leaves the organization. Their account is disabled. The agent continues running, indefinitely, under the disabled account's embedded credentials. Security teams rarely discover these agents until an incident forces a manual audit. One enterprise discovered 377 agents they did not know existed through a single assessment. Orphaned AI agents represent a machine insider risk that no offboarding checklist currently covers.
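Finding orphaned agents is, at its simplest, a join between the agent inventory and the directory. The sketch below assumes both are available as plain Python structures with hypothetical field names, and it treats a creator missing from the directory the same as a disabled one, which is a design assumption rather than a rule from any platform.

```python
def find_orphaned_agents(agents, account_status):
    """Return (agent_id, creator, status) for enabled agents whose creator
    account is not active. Missing accounts are reported as "missing"."""
    orphaned = []
    for agent in agents:
        if not agent["enabled"]:
            continue  # a disabled agent is not running under anyone's credentials
        status = account_status.get(agent["creator"], "missing")
        if status != "active":
            orphaned.append((agent["agent_id"], agent["creator"], status))
    return orphaned


# Assumed sample data.
agents = [
    {"agent_id": "notes-bot", "creator": "dana", "enabled": True},
    {"agent_id": "crm-sync", "creator": "lee", "enabled": True},
    {"agent_id": "old-poc", "creator": "lee", "enabled": False},
    {"agent_id": "mystery", "creator": "sam", "enabled": True},
]
account_status = {"dana": "active", "lee": "disabled"}

orphans = find_orphaned_agents(agents, account_status)
for agent_id, creator, status in orphans:
    print(f"{agent_id}: creator {creator} is {status}")
```

Running a check like this on every offboarding event, not just during audits, is what turns it from a cleanup task into a detection.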

4. Shadow Agents and Shadow MCP Servers

Business users build agents on low-code platforms without IT or security involvement. Those agents connect to unsanctioned MCP servers, which in turn expose tool calls to unregistered external domains. The shadow AI problem has evolved: it is no longer just employees uploading files to personal AI accounts. It is autonomous agents running with broad OAuth grants, connecting to infrastructure security has never seen, operating entirely outside any governance boundary.
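A first-pass check for shadow MCP connections is an allowlist comparison on the server hostname. The sketch below uses only the Python standard library; the connection records, the `server_url` field, and the example domains are hypothetical.

```python
from urllib.parse import urlparse


def is_sanctioned(url, sanctioned_domains):
    """True when the URL's host is a sanctioned domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in sanctioned_domains)


def find_shadow_mcp(connections, sanctioned_domains):
    """Return connections whose MCP server is not on the sanctioned list."""
    return [
        c for c in connections
        if not is_sanctioned(c["server_url"], sanctioned_domains)
    ]


# Assumed sample data.
sanctioned_domains = {"mcp.internal.example.com", "example-vendor.com"}
connections = [
    {"agent_id": "a1", "server_url": "https://mcp.internal.example.com/sse"},
    {"agent_id": "a2", "server_url": "https://tools.example-vendor.com:8443/mcp"},
    {"agent_id": "a3", "server_url": "https://evil-example-vendor.com/mcp"},
    {"agent_id": "a4", "server_url": "http://203.0.113.7:9000/mcp"},
]

shadow = find_shadow_mcp(connections, sanctioned_domains)
for c in shadow:
    print(f'{c["agent_id"]} -> {c["server_url"]}')
```

The match requires a leading dot before the sanctioned domain, so a lookalike such as `evil-example-vendor.com` does not pass as a subdomain of `example-vendor.com`.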

5. Agent-to-Agent Data Exposure

Agent A holds limited permissions. Agent B holds broad permissions. Agent A calls Agent B as part of a workflow. Agent B returns data that Agent A should never have been able to access. The cross-platform agent communication pathway is a blind spot for every single-platform monitoring tool. Detecting it requires a view that spans the entire agent network, not just one platform's logs.
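The Agent A and Agent B case reduces to a set difference: what came back from the callee, minus what the caller is entitled to. The sketch below assumes a hypothetical call log in which each agent-to-agent call records the data scopes it returned; whether a platform exposes that is an open question for each environment.

```python
def find_cross_agent_exposure(agent_scopes, calls):
    """Return (caller, callee, leaked_scopes) for each call that returned
    data scopes the calling agent does not itself hold."""
    findings = []
    for call in calls:
        caller_scopes = agent_scopes.get(call["caller"], set())
        leaked = set(call["returned_scopes"]) - caller_scopes
        if leaked:
            findings.append((call["caller"], call["callee"], sorted(leaked)))
    return findings


# Assumed sample data: agent-a is limited, agent-b is broad.
agent_scopes = {
    "agent-a": {"tickets:read"},
    "agent-b": {"tickets:read", "payroll:read", "crm:read"},
}
calls = [
    {"caller": "agent-a", "callee": "agent-b",
     "returned_scopes": ["tickets:read", "payroll:read"]},
    {"caller": "agent-b", "callee": "agent-a",
     "returned_scopes": ["tickets:read"]},
]

exposures = find_cross_agent_exposure(agent_scopes, calls)
print(exposures)
```

The check only works when scopes and calls from both agents' platforms land in the same place, which is why a single-platform log cannot produce it.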

For a deeper look at how these patterns combine into compounding risk, the AI agent visibility layer is where the full picture comes together.

Effective Authority: The Metric That Actually Matters

Most security tools measure theoretical configuration. Effective authority is what an agent can actually execute inside a SaaS application after all entitlements, OAuth grants, connector permissions, and inherited credentials resolve.

Theoretical configuration and effective authority are almost never the same.

An agent may be configured to access customer data. Its effective authority may include read and write access to every Salesforce object in the org, the ability to trigger automated flows in system mode, and access to files labeled as confidential under the organization's data classification policy. The configuration description is accurate. It is also dangerously incomplete.

Effective authority is the correct unit of measurement for AI agent risk management because it answers the question that actually matters during an incident: what could this agent have done, and what did it actually do?

Mapping effective authority requires correlating agent configuration with entitlements from third-party applications, identity context, MCP server connections, and real-time tool-call activity. No single-platform log can produce that picture. It requires a correlation layer that connects all of those data sources into one authoritative view.
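As a rough model, effective authority is the union of every permission source that resolves at execution time, and the gap is whatever that union contains beyond the stated configuration. The sketch below is a simplification under stated assumptions: the source names and permission strings are hypothetical, and real entitlements are rarely flat sets.

```python
PERMISSION_SOURCES = ("oauth_grants", "connector_permissions", "inherited_credentials")


def effective_authority(agent):
    """Union of configured permissions and every other resolved source."""
    effective = set(agent["configured"])
    for source in PERMISSION_SOURCES:
        effective |= set(agent.get(source, ()))
    return effective


def authority_gap(agent):
    """Permissions the agent can exercise that its configuration never mentions."""
    return effective_authority(agent) - set(agent["configured"])


# Assumed sample data, modeled on a meeting-notes summarizer.
agent = {
    "agent_id": "meeting-summarizer",
    "configured": {"meetings:read"},
    "oauth_grants": {"sharepoint:read_all"},
    "connector_permissions": {"crm:write"},
    "inherited_credentials": {"mail:forward"},
}

gap = authority_gap(agent)
print(sorted(gap))
```

An empty gap means configuration and reality agree for that agent. Anything else is authority that no configuration review would have surfaced.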

Non-human identities now outnumber human identities by 25 to 50 times in modern enterprises. AI agents are the fastest-growing NHI category. Every agent holds tokens and credentials exactly like a human insider, but no insider risk program covers them. That is the machine insider risk gap, and effective authority mapping is the mechanism that closes it. Learn more about what non-human identities mean for your security program.

Toxic Combinations and the Blast Radius Problem

Individual risk factors are manageable. Toxic combinations are not.

A single agent that is publicly accessible carries medium severity. A single agent whose creator account is disabled carries medium severity. A single agent with a connector running in maker mode carries high severity. Stack all three on the same agent, and the blast radius becomes critical: any unauthenticated user can invoke an orphaned agent running under a disabled admin's credentials, with no owner to remediate it.

Toxic combinations are the detection signal that most security tools miss because they score risk factors in isolation. Effective agentic AI security requires a detection layer that identifies when multiple risk factors stack on a single agent and escalates that combination to critical priority automatically.

Risk Factor | Individual Severity | Combined Severity
Publicly accessible agent | Medium | Critical (when combined)
Creator account disabled | Medium | Critical (when combined)
Connector in maker mode with sensitive access | High | Critical (when combined)
Org-wide accessible agent | Medium | Escalates with any other factor
Hardcoded credentials in connector | High | Critical with public access

The blast radius concept applies directly here. An agent with narrow access and a single misconfiguration has a small blast radius. An agent with broad SaaS entitlements, public accessibility, and an orphaned owner has a blast radius that spans the entire connected data environment. Detecting the combination is what turns a medium-priority alert into an incident-level response.
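Combination scoring can be expressed as a small rule set layered over individual severities. The sketch below encodes one reading of the table above and should be treated as an assumption: the factor names are hypothetical, "critical when combined" is interpreted as all three of the first factors present, and "escalates" is interpreted as one severity level up.

```python
SEVERITY_ORDER = ["low", "medium", "high", "critical"]

BASE_SEVERITY = {
    "public": "medium",
    "creator_disabled": "medium",
    "maker_mode_sensitive": "high",
    "org_wide": "medium",
    "hardcoded_credentials": "high",
}

# Assumed interpretation of which stacks count as toxic.
CRITICAL_COMBOS = [
    frozenset({"public", "creator_disabled", "maker_mode_sensitive"}),
    frozenset({"public", "hardcoded_credentials"}),
]


def score_agent(factors):
    """Return the combined severity for the set of risk factors on one agent."""
    factors = set(factors)
    if not factors:
        return "low"
    if any(combo <= factors for combo in CRITICAL_COMBOS):
        return "critical"
    level = max(SEVERITY_ORDER.index(BASE_SEVERITY[f]) for f in factors)
    # Org-wide access raises the result one level when any other factor is present.
    if "org_wide" in factors and len(factors) > 1:
        level = min(level + 1, len(SEVERITY_ORDER) - 1)
    return SEVERITY_ORDER[level]


print(score_agent({"public"}))
print(score_agent({"public", "creator_disabled", "maker_mode_sensitive"}))
print(score_agent({"org_wide", "creator_disabled"}))
```

The thresholds belong to each organization. What matters is that the score is computed over the set of factors on an agent, not over each factor alone.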

Runtime Detection vs. Static Posture: The Core Distinction

Static posture detection answers one question: how is this agent configured? Runtime AI security detection answers a different question: what did this agent actually do, and was any of it anomalous?

The distinction matters because probabilistic agents deviate from their intended behavior. They do not follow scripts. An agent configured to summarize documents may, under the right sequence of inputs, begin reading files outside its intended scope, forwarding content to connected services, or executing tool calls that were never part of the original workflow design. Configuration tells you nothing about that deviation. Runtime observation catches it.

Static posture tools are useful for governance audits. They are not sufficient for runtime detection because they cannot detect what happens between configuration reviews. Agents run continuously. Incidents do not wait for quarterly audits.

The practical implication for security teams: detection must operate at the tool-call level, in real time, correlating each action against the agent's identity, its effective authority, and the downstream systems it touches. That is the runtime truth layer. It is the only layer that can catch privilege escalation, action chaining, and shadow agent activity before the data has already moved.
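At the tool-call level, each event can be compared against two baselines: what the workflow was designed to do, and what the agent is able to do. The sketch below is illustrative only; the event shape, the `system:action` permission strings, and the three verdict labels are assumptions.

```python
def classify_tool_call(event, intended, effective):
    """Classify one tool call for one agent.

    expected             -> part of the designed workflow
    deviation            -> permitted by effective authority, outside the design
    unauthorized_attempt -> beyond even the agent's effective authority
    """
    permission = f'{event["system"]}:{event["action"]}'
    agent_id = event["agent_id"]
    if permission in intended.get(agent_id, set()):
        return "expected"
    if permission in effective.get(agent_id, set()):
        return "deviation"
    return "unauthorized_attempt"


# Assumed sample data for a document summarizer.
intended = {"summarizer": {"sharepoint:read"}}
effective = {"summarizer": {"sharepoint:read", "mail:forward", "crm:write"}}
events = [
    {"agent_id": "summarizer", "system": "sharepoint", "action": "read"},
    {"agent_id": "summarizer", "system": "mail", "action": "forward"},
    {"agent_id": "summarizer", "system": "hr", "action": "delete"},
]

verdicts = [classify_tool_call(e, intended, effective) for e in events]
print(verdicts)
```

The middle category is the one static posture never sees: an action the agent was allowed to take, took, and was never designed to take.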

Deterministic guardrails extend this further: they apply fixed, predictable enforcement rules to dynamic, probabilistic agents at the moment of execution. Detection without the ability to enforce is expensive logging. The target state is detection that feeds directly into enforcement, blocking the action before the blast radius expands. Runtime enforcement of this kind is a roadmap capability; continuous detection and effective authority mapping are the present foundation it builds on.
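To make the idea of a deterministic guardrail concrete, the sketch below evaluates a fixed, ordered rule list against a proposed tool call before it executes. It is a hypothetical illustration of the concept, not a description of any shipping capability: the rule names, call fields, and the 1,000-record threshold are all invented for the example.

```python
from typing import Callable, NamedTuple, Optional, Tuple


class Rule(NamedTuple):
    name: str
    matches: Callable[[dict], bool]
    decision: str  # "block" or "review"


# Hypothetical rules; the first matching rule wins.
RULES = [
    Rule(
        "block-confidential-egress",
        lambda c: c["external"] and c["label"] == "confidential",
        "block",
    ),
    Rule(
        "review-bulk-read",
        lambda c: c["action"] == "read" and c["record_count"] > 1000,
        "review",
    ),
]


def decide(call: dict, rules=RULES) -> Tuple[str, Optional[str]]:
    """Return (decision, rule_name). Same input always yields the same output."""
    for rule in rules:
        if rule.matches(call):
            return rule.decision, rule.name
    return "allow", None


egress = {"action": "write", "external": True, "label": "confidential", "record_count": 12}
bulk = {"action": "read", "external": False, "label": "internal", "record_count": 5000}
routine = {"action": "read", "external": False, "label": "internal", "record_count": 3}

print(decide(egress))
print(decide(bulk))
print(decide(routine))
```

The agent may be probabilistic, but the rule evaluation is not. That predictability is what makes the decision auditable after the fact.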

Building a Single Pane of Glass for AI Agent Risk

The prerequisite for any effective AI agent threat detection program is inventory. You cannot detect threats to agents you do not know exist.

One enterprise had 2,500 agents created before any inventory existed. Another discovered 377 Copilot agents through a single assessment. These are not outliers. They are the baseline condition for most organizations in 2026. AI agents are deployed by business teams on low-code platforms, by developers connecting MCP servers to coding assistants, and by vendors embedding agentic workflows into SaaS products. Security teams inherit the risk without the visibility.

A single pane of glass for AI agent risk requires:

Complete agent inventory across all platforms (Copilot Studio, Salesforce Agentforce, Amazon Bedrock, n8n, ChatGPT Enterprise, and others), including agents security did not sanction.

MCP server inventory covering both sanctioned and shadow MCP connections, with visibility into what tools each server exposes.

Identity correlation mapping each agent to its creator, its runner population, and the credentials it uses at execution time.

Effective authority mapping showing what each agent can actually do inside each connected SaaS application.

Runtime tool-call visibility capturing what agents actually did, not just what they were configured to do.

That combination, delivered without requiring a SaaS connector for every application in the stack, is what separates connector-free AI runtime security from legacy approaches. Security teams should not need IT sign-off on every SaaS integration to get visibility into agent risk. The AI agent governance framework provides the structure for operationalizing this across the enterprise.
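The five requirements listed above map naturally onto a single record per agent. The sketch below shows one possible shape, with a helper that reports which parts of the picture are still missing; every field name is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRecord:
    agent_id: str
    platform: str                       # e.g. "Copilot Studio"
    sanctioned: bool                    # False for agents security never approved
    creator: str
    runners: set = field(default_factory=set)
    execution_credential: str = ""      # identity used at execution time
    mcp_servers: list = field(default_factory=list)
    effective_authority: dict = field(default_factory=dict)  # app -> permissions
    recent_tool_calls: list = field(default_factory=list)


def visibility_gaps(record):
    """List the inventory dimensions that have no data for this agent."""
    gaps = []
    if not record.execution_credential:
        gaps.append("identity correlation")
    if not record.effective_authority:
        gaps.append("effective authority")
    if not record.recent_tool_calls:
        gaps.append("runtime tool-call visibility")
    return gaps


# Assumed sample record: discovered, but only partially understood.
record = AgentRecord(
    agent_id="notes-bot",
    platform="Copilot Studio",
    sanctioned=False,
    creator="dana",
    execution_credential="dana",
)

print(visibility_gaps(record))
```

An agent that appears in inventory with gaps in every other dimension is known to exist and nothing more. Tracking the gaps explicitly shows how far the program is from a complete view.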

Actionable next steps:

Run an AI agent risk assessment to establish a baseline inventory of agents, owners, and effective authority across your environment.
Identify every agent whose creator account is disabled. Those are orphaned agents running with inherited credentials today.
Map all MCP server connections, sanctioned and unsanctioned, and flag any connecting to unregistered external domains.
Establish how runtime signals translate into actionable alerts in your environment.
Define your toxic combination thresholds: which stacked risk factors trigger critical-priority response in your environment.

Configuration is not reality. Runtime truth is the only foundation that holds.

Frequently Asked Questions

What is AI agent threat detection?

AI agent threat detection is the practice of identifying high-risk runtime patterns produced by autonomous AI agents operating inside enterprise SaaS environments. It focuses on what agents actually do at execution time, including privilege escalation, action chaining, and anomalous data movement, rather than what their configuration says they should do.

Why do traditional security tools miss AI agent threats?

Traditional tools are designed for human users and static application configurations. They cannot correlate an agent's runtime tool-call activity against its inherited credentials, its invoker's identity, and the effective authority it holds inside each connected SaaS application. That correlation is the core detection requirement for agentic AI risk.

What is a toxic combination in AI agent security?

A toxic combination is when multiple risk factors stack on a single agent simultaneously, creating compounding, critical-severity exposure. For example: an agent that is publicly accessible, whose creator account is disabled, and whose connector runs in maker mode with sensitive data access. Each factor alone is medium or high severity. Together, they represent a critical-priority incident risk.

What is maker mode, and why is it dangerous?

Maker mode is a configuration where an agent runs under the creator's embedded credentials, regardless of who invokes it. Any user who can reach the agent, including users without direct access to the underlying SaaS application, effectively operates at the creator's privilege level. This bypasses IAM controls entirely and is one of the most common privilege escalation vectors in enterprise AI deployments.

What is the difference between runtime AI security and static posture detection?

Static posture detection reviews how an agent is configured at a point in time. Runtime AI security observes what the agent actually does as it executes, capturing tool calls, data access, identity context, and action sequences in real time. Static posture tells you what could happen. Runtime detection tells you what did happen.

What are orphaned AI agents, and why are they a security risk?

Orphaned agents are agents that keep running with inherited credentials after their creator or owner account has been disabled, typically through employee offboarding. They represent a machine insider risk because they hold active tokens with no accountable owner, no remediation path, and often broad SaaS access that was never revoked.