
AI Agent Attack Surface: Every Vector Security Teams Need to Know

Most AI agents running in enterprise SaaS environments hold excessive privileges, and the tooling most security teams rely on to map attack surfaces was built for a world where requests come from browsers, not autonomous machines executing multi-step action chains across a dozen platforms simultaneously.

Aman A. · SEO Manager · Obsidian Security · May 14, 2026
Key Takeaways
  • The AI agent attack surface spans three distinct layers: identity, action, and data. Each layer has vectors that standard AppSec tooling cannot observe.
  • Machine identities now outnumber human identities 25 to 50 times in modern enterprises, and AI agents represent the fastest-growing and least-governed segment of that population.
  • Maker mode configurations allow any user to invoke an agent at the creator's privilege level, bypassing IAM controls entirely without triggering a single alert.
  • Action chaining compounds blast radius with every tool call an agent makes, and most of those calls happen faster than any human-reviewed log can catch.
  • Mapping this surface requires runtime truth, not theoretical configuration. What the agent is set up to do and what it can actually do are two different answers.

Why the AI Agent Attack Surface Requires Its Own Mapping

Security teams have relied on the OWASP Top 10 for web applications and the OWASP API Security Top 10 as their surface mapping anchors for years. Both frameworks assume a human initiates a request, a server responds, and the interaction is bounded by a session. Even the most complex API attack, such as a broken object-level authorization (BOLA) exploit, follows a request-response model that network and endpoint tools can observe.

The AI agent attack surface breaks every one of those assumptions.

Agentic AI has four properties that make traditional surface mapping structurally inadequate:

  1. Probabilistic execution. An agent does not follow a deterministic code path. It reasons about which tool to call next. The same prompt can produce different action sequences on different runs. You cannot enumerate the surface by reading the code.
  2. Identity inheritance. Agents do not authenticate as themselves. They authenticate as the service account, OAuth token, or embedded credential provisioned at build time. The agent's identity is borrowed, not native.
  3. Action chaining. A single agent invocation can trigger dozens of downstream tool calls across multiple SaaS platforms. Each call extends the blast radius of the original request.
  4. Multi-platform reach. A single agent built in Microsoft Copilot Studio can connect to Salesforce, SharePoint, and an external MCP server simultaneously. No single-platform log captures the full chain.

Posture-based surface mapping, which reads configuration files and policy settings, sees the theoretical configuration of an agent. It tells you what the agent is supposed to be allowed to do. It cannot tell you what the agent can actually execute inside each connected SaaS application after all entitlements resolve. That gap between theoretical configuration and effective authority is where the real attack surface lives.

For a broader view of the risk categories that emerge from these properties, see our overview of AI agent security risks.

The Identity Layer Vectors of the AI Agent Attack Surface

Every agentic attack begins at the identity layer. Before an agent takes any action, it authenticates somewhere. Understanding how that authentication works, and where it breaks down, is the foundation of attack surface mapping.

Machine Identities and the NHI Explosion

Non-human identities (NHIs) now outnumber human identities 25 to 50 times in modern enterprises. AI agents represent the fastest-growing and least-governed segment of that population. Unlike service accounts created by IT, agents are often provisioned by business users with no security review. The result is NHI sprawl: thousands of machine identities with no owner, no lifecycle policy, and no revocation process.
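
One practical way to find the unowned segment of that population is to cross-reference the agent registry against identity provider state. A minimal sketch, assuming hypothetical list_agents() and get_idp_user() helpers that wrap your platform inventory and IdP APIs:

```python
# Hypothetical helpers: list_agents() returns agent records from your
# platform inventory; get_idp_user() looks up the creator in the IdP.
def find_orphaned_agents(list_agents, get_idp_user):
    """Flag agents whose creator account is disabled or deleted."""
    orphaned = []
    for agent in list_agents():
        owner = get_idp_user(agent["creator_id"])
        # An agent with no live owner has no one to rotate or revoke
        # its credentials -- the definition of NHI sprawl.
        if owner is None or not owner.get("enabled", False):
            orphaned.append(agent["id"])
    return orphaned
```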

OAuth Tokens and Bearer Token Persistence

Agents authenticate to SaaS platforms primarily through OAuth tokens and bearer tokens. Bearer tokens operate on a dangerous assumption: possession equals authorization. If an attacker captures a bearer token, every system that trusts it will treat the attacker as the legitimate agent. The ShinyHunters breach of Workday and the Salesloft-Drift compromise demonstrated this at scale, with attackers accessing the Salesforce environments of more than 700 organizations without stealing a single password. The malicious traffic was indistinguishable from legitimate agent activity.
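
The mechanics are visible in a few lines. In this sketch the endpoint and token are illustrative; the point is that nothing binds the token to the client presenting it:

```python
import requests

# Illustrative values -- any party holding this token is treated as the agent.
STOLEN_TOKEN = "eyJhbGciOi..."  # captured bearer token
API = "https://api.example-saas.com/v1/records"

# The server validates the token's signature and scopes, but nothing ties
# the token to the original agent: possession is authorization.
resp = requests.get(API, headers={"Authorization": f"Bearer {STOLEN_TOKEN}"})
print(resp.status_code)  # 200 for attacker and legitimate agent alike
```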

For deeper context on how OAuth tokens become an attack vector, see what are OAuth tokens and their vulnerabilities.

Embedded Credentials in Maker Mode Configurations

Maker mode is the most dangerous default configuration in the agentic ecosystem, and it is also the most common. When a developer or business user builds an agent in Copilot Studio, Salesforce Agentforce, or a similar platform, they often configure it with their own credentials embedded as a fixed connection. Any user who invokes that agent runs it at the creator's privilege level. A user without Salesforce access can invoke a maker mode agent built by a Salesforce administrator and retrieve CRM records they were never authorized to see. The IAM control was bypassed cleanly. Nothing technically failed.
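
Platform configuration formats differ, but the shape of the problem is consistent: the connection resolves to the maker's stored credential regardless of who invokes the agent. A hypothetical agent definition and resolution step:

```python
# Hypothetical agent definition illustrating the maker mode pattern:
# the connection credential is fixed at build time to the creator.
agent_config = {
    "name": "account-summary-agent",
    "creator": "sf-admin@corp.example",
    "connections": {
        "salesforce": {
            "auth_mode": "maker",               # fixed creator credential
            "credential_ref": "conn-sf-admin",  # admin's stored OAuth connection
        }
    },
}

def resolve_credential(config, connection, invoker):
    conn = config["connections"][connection]
    if conn["auth_mode"] == "maker":
        # The invoker's identity is ignored: every caller runs at the
        # creator's privilege level, with no alert and no failure.
        return conn["credential_ref"]
    return f"conn-{invoker}"  # a user-context mode would bind to the invoker
```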

Delegation Chains and Confused Deputy Paths

Delegation chains occur when Agent A passes context or credentials to Agent B, which passes them further downstream. Each handoff extends the trust chain without re-validating the original requester's permissions. The confused deputy pattern is a specific exploitation of this: an agent with elevated permissions is manipulated into performing an action on behalf of an unauthorized user. The agent did nothing wrong by its own logic. The invoker simply leveraged the agent's authority to reach data they could not access directly.

Identity Layer Vector Detection Difficulty

| Vector | Detection Difficulty | Reason |
| --- | --- | --- |
| Maker mode credential inheritance | Hard | No alert is generated; the agent behaves as designed |
| Bearer token reuse after exfiltration | Hard | Legitimate and malicious traffic look identical |
| Orphaned agent with disabled owner | Medium | Requires cross-referencing agent registry with IdP state |
| OAuth scope over-provisioning | Medium | Visible in configuration but not in runtime behavior |
| Agent-to-agent credential forwarding | Hard | Requires cross-platform correlation; no single log captures the chain |
| Service account shared across agents | Easy | Discoverable via static configuration review |
| Hardcoded secrets in workflow config | Easy | Detectable via configuration scan of agent definitions |

The Action Layer Vectors of the AI Agent Attack Surface

The identity layer determines who the agent is. The action layer determines what it does. These are the vectors that generate real-world impact, and they are the hardest to observe without runtime truth.

Action Chaining Mechanics and Blast Radius Math

Action chaining is the mechanism by which a single agent invocation produces compounding access. An agent asked to summarize a customer account might query Salesforce for the account record, retrieve related emails from Exchange, pull the latest contract from SharePoint, and write a summary to a shared Slack channel. Four tool calls. Four systems touched. Each call extends the blast radius of the original request. If any one of those systems contains data the invoking user should not see, the agent has already moved it by the time a human reviewer checks the log.
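
A compressed sketch of that chain, with hypothetical connector objects standing in for the platform integrations, shows how quickly one invocation fans out:

```python
def draft_summary(record, emails, contract):
    # Stand-in for the model call that composes the summary.
    return f"{record['name']}: {len(emails)} emails, contract {contract['id']}"

def summarize_account(account_id, salesforce, exchange, sharepoint, slack):
    """One invocation, four tool calls, four systems touched."""
    record = salesforce.get_account(account_id)        # system 1: Salesforce
    emails = exchange.search(f"account:{account_id}")  # system 2: Exchange
    contract = sharepoint.fetch_latest(account_id)     # system 3: SharePoint
    summary = draft_summary(record, emails, contract)
    # System 4: if any source held data the invoker should not see,
    # it is already in the shared channel by the time a log is read.
    slack.post("#sales-shared", summary)
    return summary
```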

Tool Invocation and the Trust Boundary Problem

When an agent invokes a tool, it presents its embedded credentials to the target system. The target system does not know who originally invoked the agent. It trusts the credential. This trust boundary problem means that the invoker's permissions are never validated against the target system's access controls. The agent is the only entity in the chain, and its effective authority is the only thing that matters at the point of tool invocation.
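
Stated as code, the gap is a check that never runs. The sketch below is illustrative; tool.execute and target_authorizes are stand-ins:

```python
def invoke_tool(tool, action, agent_credential, invoker=None):
    # The target system authorizes against the agent's credential alone;
    # the invoker's permissions are never consulted. A control closing
    # the gap would look like:
    #   if not target_authorizes(invoker, action):
    #       raise PermissionError(invoker)
    # but no such check exists at the point of tool invocation.
    return tool.execute(action, credential=agent_credential)
```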

MCP Server Connections: Sanctioned vs. Unsanctioned

Model Context Protocol (MCP) servers extend an agent's tool set dynamically. An agent can connect to an MCP server and gain access to every tool that server exposes, including tools the security team has never reviewed. Sanctioned MCP servers are known, reviewed, and governed. Unsanctioned MCP servers are shadow infrastructure. MCP server counts inside developer tools can grow quarter over quarter without any centralized tracking. The tools inside an MCP server are only visible at runtime. No configuration review can enumerate them in advance. See how Obsidian approaches AI agent visibility to understand what runtime discovery looks like in practice.
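
Because the tool set only materializes at runtime, enumeration has to happen against the live server. MCP exposes its tools via the JSON-RPC tools/list method; the sketch below assumes an HTTP transport and a hypothetical endpoint (real deployments also use stdio and SSE):

```python
import json
import urllib.request

# Hypothetical MCP server endpoint; transports vary across deployments.
MCP_URL = "https://mcp.internal.example/rpc"

# "tools/list" is the MCP method for enumerating what an agent gains
# by connecting -- the tool set is not visible in any static config.
payload = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "tools/list"}).encode()
req = urllib.request.Request(MCP_URL, data=payload,
                             headers={"Content-Type": "application/json"})
with urllib.request.urlopen(req) as resp:
    tools = json.load(resp)["result"]["tools"]

for tool in tools:
    print(tool["name"], "-", tool.get("description", ""))
```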

Agent-to-Agent Communication Across Platforms

A Copilot Studio agent can invoke a Vertex AI agent, which can invoke a custom n8n workflow. Each platform has its own log. No single platform log captures the full chain. Security teams relying on per-platform visibility are chasing ghosts: they see fragments of an action sequence without the context to understand what the agent actually did end-to-end. This is the blind spot that securing n8n workflows addresses at the workflow orchestration layer.

Privilege Escalation via Maker Mode

Privilege escalation in the agentic context does not require a vulnerability. It requires a misconfiguration that is treated as a feature. A Salesforce user with standard access invokes an agent built in maker mode by a Salesforce administrator. The agent runs on the administrator's credentials. The user asks the agent to retrieve records outside their normal access scope. The agent complies. The user has escalated their privileges without exploiting any code flaw. This is agentic AI privilege escalation in its most common and least visible form. Securing Salesforce Agentforce requires specifically addressing this maker mode inheritance pattern.

The Data Layer Vectors of the AI Agent Attack Surface

Identity and action vectors create access. Data layer vectors determine what leaves, where it goes, and whether anyone can trace it.

RAG Retrieval Boundary Violations

Retrieval-Augmented Generation (RAG) systems allow agents to query knowledge bases before responding. Those knowledge bases often contain documents with mixed sensitivity levels. An agent configured to answer general HR questions may retrieve a document tagged for executive-only access if the retrieval boundary is not enforced at the chunk level. The agent did not bypass any access control. The retrieval system simply returned what was most semantically relevant. Sensitivity labels applied at the document level do not automatically propagate to individual retrieved chunks.
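
Closing that gap means filtering after retrieval and before generation, at the chunk level. A minimal sketch, assuming each chunk carries a sensitivity field that your ingestion pipeline propagates explicitly from the source document:

```python
SENSITIVITY_RANK = {"public": 0, "internal": 1, "confidential": 2, "executive": 3}

def filter_chunks(chunks, invoker_clearance):
    """Drop retrieved chunks above the invoker's clearance before they
    ever reach the model context. Relevance is not authorization."""
    allowed = []
    for chunk in chunks:
        # Fail closed: an unlabeled chunk is treated as most sensitive.
        label = chunk.get("sensitivity", "executive")
        if SENSITIVITY_RANK[label] <= SENSITIVITY_RANK[invoker_clearance]:
            allowed.append(chunk)
    return allowed
```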

Sensitivity Label and DLP Bypass via Agent Context

Traditional Data Loss Prevention tools inspect content moving through known channels: email, web upload, endpoint file transfer. An agent moving data between SaaS applications via API calls operates outside those channels entirely. A Microsoft Information Protection label on a SharePoint document does not prevent an agent from reading that document and writing its contents to an external system via an API call. The label exists. The DLP policy exists. The agent bypasses both because it is operating in a layer those tools were not designed to observe.

Data Movement at Machine Speed

AI agents transfer up to 16 times more data than traditional SaaS integrations. A human exfiltrating data manually produces a detectable behavioral anomaly: unusual access times, high volume relative to baseline, geographic anomalies. An agent moving the same data produces no such signal because agents have no behavioral baseline. By the time a security team reviews the log, the data has already moved. Agentic AI data leakage is a volume and velocity problem that human-speed detection cannot solve.
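
With no baseline to learn, a practical first control is an absolute volume ceiling per agent per time window, tuned from the agent's inventoried purpose. A hedged sketch with illustrative thresholds:

```python
import time
from collections import defaultdict

# Illustrative ceiling: no workflow for this agent should move more
# than 50 MB per hour. Tune per agent from its inventoried purpose.
MAX_BYTES_PER_HOUR = 50 * 1024 * 1024

_windows = defaultdict(list)  # agent_id -> [(timestamp, bytes_moved)]

def record_transfer(agent_id, nbytes, now=None):
    """Return False (block or alert) when an agent exceeds its hourly ceiling."""
    now = now or time.time()
    window = [(t, b) for t, b in _windows[agent_id] if now - t < 3600]
    window.append((now, nbytes))
    _windows[agent_id] = window
    return sum(b for _, b in window) <= MAX_BYTES_PER_HOUR
```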

Output Side-Channel Risks

Agents produce outputs: summaries, reports, recommendations, drafted emails. Those outputs can contain sensitive data extracted from source systems, even when the agent was not asked to extract it. A summarization agent that includes verbatim contract terms in a public-facing Slack channel has exfiltrated data without any explicit exfiltration command. The output channel is a data layer vector that most monitoring frameworks do not track.

Mapping This AI Agent Attack Surface in Practice

Security teams asking where to start are asking the right question. The answer is identity first, always.

Start with the inventory question. You cannot map a surface you cannot see. The first operational step is building a complete AI agent inventory: every agent, its creator, its connected systems, its embedded credentials, and its current permission scope. Enterprise inventories routinely surface hundreds of Copilot agents that were never catalogued and thousands of agents created before any inventory existed. Inventory is the prerequisite for every conversation that follows.
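
The inventory does not need to be sophisticated to be useful; it needs to be complete. A minimal record shape covering the fields named above:

```python
from dataclasses import dataclass, field

@dataclass
class AgentRecord:
    """One row of the AI agent inventory -- the prerequisite artifact."""
    agent_id: str
    name: str
    platform: str             # e.g. "Copilot Studio", "Agentforce", "n8n"
    creator: str              # owning identity; flag if disabled in the IdP
    connected_systems: list = field(default_factory=list)
    embedded_credentials: list = field(default_factory=list)  # credential refs
    permission_scope: str = ""  # effective scope, not just configured scope
```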

Prioritize by toxic combinations, not individual risk factors. A single risk factor, such as an agent with org-wide access, is medium severity. The same agent combined with a disabled creator account and an unsanctioned MCP server connection is critical. Prioritization frameworks that score risk factors in isolation miss the compounding effect. Map for combinations, not individual signals.
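
Combination scoring is straightforward to express. In this illustrative sketch, findings stay medium in isolation and escalate only when a known toxic combination is fully present:

```python
# Illustrative toxic combinations; factors are booleans from the inventory.
TOXIC_COMBINATIONS = [
    {"org_wide_access", "creator_disabled", "unsanctioned_mcp"},
    {"maker_mode", "org_wide_access"},
]

def priority(factors):
    """Score an agent by combinations, not isolated signals."""
    present = {name for name, active in factors.items() if active}
    if any(combo <= present for combo in TOXIC_COMBINATIONS):
        return "critical"
    return "medium" if present else "low"

# Org-wide access alone is medium; add a disabled creator and an
# unsanctioned MCP server and the same agent becomes critical.
assert priority({"org_wide_access": True}) == "medium"
assert priority({"org_wide_access": True, "creator_disabled": True,
                 "unsanctioned_mcp": True}) == "critical"
```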

The sequencing framework:

  1. Identity layer first. Identify every machine identity associated with your agent fleet. Flag orphaned agents, maker mode configurations, and shared service accounts.
  2. Action layer second. Map which agents are connected to MCP servers, which MCP servers are unsanctioned, and which agents have cross-platform action chains that cross trust boundaries.
  3. Data layer third. Identify which agents have access to RAG knowledge bases containing sensitive data, which agents write outputs to shared channels, and which agents are moving data at volumes that exceed any reasonable workflow requirement.

The core operational challenge is that posture-based tools answer the identity and partial action layer questions from configuration. They cannot answer the data layer questions at all, and they cannot answer any layer question with runtime evidence. What the agent is configured to do is theoretical configuration. What it actually did, what data it touched, and whether the invoker was authorized are runtime truth questions. Probabilistic agents require deterministic guardrails enforced at runtime, not policy statements reviewed quarterly. Runtime guardrail enforcement is generally available on Microsoft Copilot today, with expanded platform coverage on the roadmap.

The SaaS AI Agent Risk Assessment provides a structured starting point for security teams beginning this mapping process. For teams ready to move from inventory to governance, AI Agent Governance outlines the control framework that sits above the surface map.

Obsidian's Knowledge Graph correlates agent configuration, identity entitlements, MCP server connections, and runtime behavior into a single authority map, producing effective authority visibility across every layer of this surface without requiring a connector for every SaaS tool in your stack.

The Surface Is Larger Than Your Current Tooling Can See

The AI agent attack surface is not a harder version of the API attack surface. It is a different surface entirely, built on identity inheritance, probabilistic execution, and machine-speed data movement that traditional tooling was never designed to observe.

Three actions to take this week:

  1. Run an agent inventory across your highest-risk platforms: Copilot Studio, Salesforce Agentforce, and any n8n or Bedrock deployments. Count the agents. Identify the creators. Flag any agent whose creator account is disabled.
  2. Audit your maker mode configurations. Every agent running on embedded creator credentials is a privilege escalation path waiting to be invoked.
  3. Map your MCP server connections. Separate sanctioned from unsanctioned. Treat every unsanctioned MCP server as an unknown tool set with unknown blast radius.

The surface is larger than your current tooling can see. Start with what you can inventory, prioritize by toxic combinations, and demand runtime truth before trusting any configuration-based risk score.

Frequently Asked Questions

What makes the AI agent attack surface different from a traditional API attack surface?

Traditional API attack surfaces are bounded by request-response cycles that network and endpoint tools can observe. The AI agent attack surface is probabilistic: the same agent can take different action sequences on different runs. Agents also inherit identities from embedded credentials rather than authenticating as themselves, and they chain actions across multiple platforms in ways that no single-platform log can reconstruct. These properties make standard API surface mapping tools structurally inadequate for agentic environments.

What is maker mode and why is it a critical attack vector?

Maker mode is a configuration pattern where an agent is built using the creator's credentials as a fixed, embedded connection. Any user who invokes the agent runs it at the creator's privilege level, regardless of the invoker's own permissions. A user without Salesforce access can invoke a maker mode agent built by a Salesforce administrator and retrieve records they were never authorized to see. No vulnerability is exploited. The IAM control is bypassed cleanly because the agent, not the user, is the authenticated entity in the Salesforce session.

How does action chaining expand the blast radius of an agent compromise?

Action chaining is the mechanism by which a single agent invocation triggers multiple downstream tool calls across connected systems. Each tool call extends the blast radius of the original request. An agent that queries Salesforce, retrieves files from SharePoint, and writes output to Slack in a single invocation has touched three systems and potentially moved sensitive data across all three before any human reviewer sees the first log entry. The blast radius grows with every tool call in the chain.

Why do traditional DLP tools fail to catch agentic data leakage?

Traditional DLP tools inspect content moving through known channels: email gateways, web proxies, endpoint file transfers. AI agents move data between SaaS applications via API calls that operate outside those channels entirely. An agent reading a sensitivity-labeled document and writing its contents to an external system via API produces no signal in a DLP tool that was not designed to inspect agent-to-API data flows. The label exists. The policy exists. The agent operates in a layer those tools cannot see.

Where should a security team start when mapping their AI agent attack surface?

Start with inventory. You cannot govern what you cannot see. The first step is building a complete list of every agent in your environment: its creator, its connected systems, its embedded credentials, and its current permission scope. After inventory, prioritize by toxic combinations rather than individual risk factors. An orphaned agent with org-wide access and an unsanctioned MCP server connection is a critical-priority finding. The same agent without those combinations is medium severity. Combination scoring changes the prioritization entirely.