AI Agent Governance: A Framework for Securing Autonomous Agents at Scale

Learn the six-layer AI agent governance framework: inventory, effective authority mapping, classification, toxic combinations, lifecycle management, and runtime enforcement.

Obsidian Editorial Team

Security Research

Obsidian Security

May 26, 2026

June 1, 2026

Key Takeaways

You cannot govern what you cannot see: a complete agent inventory is the prerequisite for every other governance control.
Configuration is not runtime truth: theoretical configuration tells you what an agent is set up to do, while effective authority tells you what it can actually execute inside each connected SaaS application.
Maker mode credential inheritance and orphaned agents represent two of the highest-severity governance failures in production environments today.
Toxic combinations, where multiple medium-severity risk factors appear on the same agent, require prioritized escalation above any single-factor alert.
Governance spans discovery, classification, risk prioritization, ownership, and enforcement as a target state: no single control covers the full program.

Layer One: Build the Agent Inventory

Many security teams managing AI deployments at scale cannot answer a basic question: what agents are running right now? One enterprise discovered 377 Microsoft Copilot agents through a point-in-time assessment. Another had 2,500 agents already created before any inventory process existed. These are not outlier situations. They reflect the default state of agentic AI adoption in 2026, where business teams deploy agents through low-code platforms faster than any security program can track.

The inventory problem is structural. Agents are created across multiple platforms simultaneously: Microsoft Copilot Studio, Salesforce Agentforce, Amazon Bedrock, Google Vertex AI, n8n, ChatGPT Enterprise, and others. Each platform holds its own agent registry. No native tool produces a unified view across all of them. Security teams attempting manual reconciliation through spreadsheets and ad hoc platform checks are ghost chasing: they are reviewing snapshots that are outdated before the review completes.

A governance program requires a single pane of glass across every platform where agents operate. That inventory must capture the agent's name, creator, creation date, connected applications, MCP server connections, and current operational status. Without this foundation, every subsequent governance layer operates on incomplete data. Inventory is not a governance outcome. It is the prerequisite for governance to exist at all.
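The inventory fields listed above can be sketched as a single record type merged across platform exports. This is a minimal illustration only: the `AgentRecord` class, its field names, and the sample data are hypothetical, and a real program would populate these records from each platform's own registry.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AgentRecord:
    """One inventory row; field names are illustrative, not a vendor schema."""
    platform: str
    name: str
    creator: str
    created_on: date
    connected_apps: tuple[str, ...] = ()
    mcp_servers: tuple[str, ...] = ()
    status: str = "active"  # hypothetical values: "active", "disabled"


def build_inventory(*platform_exports: list[AgentRecord]) -> dict[tuple[str, str], AgentRecord]:
    """Merge per-platform exports into one view keyed by (platform, name)."""
    inventory: dict[tuple[str, str], AgentRecord] = {}
    for export in platform_exports:
        for record in export:
            inventory[(record.platform, record.name)] = record
    return inventory


# Hypothetical sample exports from two platforms.
copilot = [AgentRecord("copilot-studio", "hr-helper", "alice", date(2026, 1, 12),
                       connected_apps=("SharePoint",))]
agentforce = [AgentRecord("agentforce", "deal-desk", "bob", date(2026, 2, 3),
                          connected_apps=("Salesforce",), mcp_servers=("crm-tools",))]

inventory = build_inventory(copilot, agentforce)
```

Keying on the platform and agent name together is one simple way to avoid collisions when two platforms host agents with the same name.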

Layer Two: Map Effective Authority, Not Theoretical Configuration

The most dangerous gap in current AI governance programs is the distance between theoretical configuration and effective authority. Theoretical configuration describes what an agent is set up to do on paper: which connectors it has, which scopes are listed, which permissions appear in the platform settings. Effective authority describes what the agent can actually execute inside each SaaS application it connects to, after all entitlements resolve.

These two things are rarely the same. An agent configured with a Salesforce connector may inherit the full administrative permissions of the user who built it. A user invoking that agent, even one without any direct Salesforce access, can extract CRM records, financial pipeline data, and customer contact information. The agent did exactly what it was configured to do. The IAM controls were bypassed entirely. This is the maker mode escalation pattern, and it is one of the most common privilege escalation vectors in production agentic environments today.

Understanding AI agent security at this level requires correlating three data points that no single platform exposes together: the identity of the person invoking the agent (the runner), the credentials the agent uses to execute (often the maker's service account), and the actual entitlements those credentials carry inside the downstream SaaS application. That correlation produces a map of effective authority. Without it, security teams are reviewing theoretical configuration and calling it governance.
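The three-way correlation described above can be sketched as a set difference between what the agent's execution identity can do and what the runner could do directly. The entitlement table, identity names, and permission labels below are hypothetical. In practice these values would come from the agent platform and the downstream SaaS application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invocation:
    runner: str              # person invoking the agent
    execution_identity: str  # credentials the agent actually uses
    app: str                 # downstream SaaS application


# Hypothetical entitlement data: identity -> app -> permissions.
ENTITLEMENTS = {
    "maker-svc": {"Salesforce": {"read_accounts", "export_reports", "modify_users"}},
    "intern": {"Salesforce": set()},
}


def entitlements_for(identity: str, app: str) -> set[str]:
    """Look up resolved permissions; unknown identities or apps get none."""
    return ENTITLEMENTS.get(identity, {}).get(app, set())


def effective_authority(inv: Invocation) -> dict[str, set[str]]:
    """Compare what the agent can do with what the runner could do directly."""
    agent_perms = entitlements_for(inv.execution_identity, inv.app)
    runner_perms = entitlements_for(inv.runner, inv.app)
    return {
        "effective": agent_perms,
        "escalated": agent_perms - runner_perms,  # gained only via the agent
    }


result = effective_authority(Invocation("intern", "maker-svc", "Salesforce"))
```

A non-empty `escalated` set is the signal of interest here: it lists permissions the runner obtains only by going through the agent.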

Non-human identities, including AI agents, now outnumber human identities by twenty-five to fifty times in modern enterprise environments. Each one carries credentials, holds tokens, and can act on data. Mapping effective authority is how governance programs close the non-human identity gap that legacy IAM programs were never designed to address.

Layer Three: Classify Sanctioned vs. Unsanctioned Agents

Not every agent in the inventory belongs there. Shadow AI agents, built and deployed without IT or security review, represent a distinct risk category from sanctioned agents that went through an approval process. The governance program needs a classification layer that separates these two populations and applies differentiated controls to each.

The distinction matters because shadow agents carry a compounding risk profile. They lack approved data handling policies. Their MCP server connections are unreviewed. Their creators may not understand the access scope they configured. And because they were never registered, no one receives alerts when they behave anomalously. Shadow agents are not inherently malicious. They are probabilistic agents operating without deterministic guardrails, and that combination produces unpredictable blast radius.

Classification also applies to MCP servers. An agent connecting to a sanctioned MCP server with a documented tool inventory presents a different risk profile than an agent connecting to an unsanctioned server with unknown tool capabilities. The tools inside an MCP server are only visible at runtime: retroactive log review cannot reconstruct what tool calls were made or what data those calls touched. Shadow AI detection across servers is therefore a classification requirement, not an optional audit step.

The output of this layer is a registry that labels every agent and every MCP connection as sanctioned, under review, or unsanctioned. That registry drives the risk prioritization layer that follows.
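A registry with these three labels might look like the following sketch. The approval sets and the agent and server names are hypothetical placeholders for whatever the organization's review process records.

```python
from enum import Enum


class Status(Enum):
    SANCTIONED = "sanctioned"
    UNDER_REVIEW = "under review"
    UNSANCTIONED = "unsanctioned"


def classify(item_id: str, approved: set[str], pending: set[str]) -> Status:
    """Anything not explicitly approved or pending is treated as unsanctioned."""
    if item_id in approved:
        return Status.SANCTIONED
    if item_id in pending:
        return Status.UNDER_REVIEW
    return Status.UNSANCTIONED


# Hypothetical review outcomes.
approved_agents = {"hr-helper"}
pending_agents = {"deal-desk"}
approved_mcp = {"crm-tools"}

registry = {
    "agents": {a: classify(a, approved_agents, pending_agents)
               for a in ["hr-helper", "deal-desk", "weekend-bot"]},
    "mcp_servers": {m: classify(m, approved_mcp, set())
                    for m in ["crm-tools", "unknown-tools"]},
}
```

Defaulting to unsanctioned means a newly discovered agent or MCP server is flagged until someone reviews it, rather than silently passing.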

Layer Four: Prioritize Risk Through Toxic Combinations

Individual risk factors on an agent are often medium severity in isolation. An agent that is org-wide accessible is a medium-severity finding. An agent whose creator account is disabled is a medium-severity finding. An agent using maker mode credentials is a high-severity finding. But when all three conditions appear on the same agent simultaneously, the combined risk reaches critical severity. This is the toxic combination pattern, and it is how governance programs avoid alert fatigue while ensuring the highest-risk agents receive immediate attention.

A toxic combination model evaluates risk factors jointly across the agent inventory data rather than one at a time. Instead of generating one alert per risk factor, the governance program scores agents based on the full combination of active risk factors. An unmanaged shadow agent with org-wide access and a disabled creator account, connected to a sensitive data source through a maker mode connector, represents a fundamentally different threat than any single one of those factors alone.

Specific combinations that consistently produce critical-severity findings include:

Risk Factor A | Risk Factor B | Combined Severity
Shadow agent (unsanctioned) | Org-wide accessible | Critical
Maker mode connector | Sensitive data access | Critical
Disabled creator account | Active maker mode credentials | Critical
Hardcoded secrets | Public webhook endpoint | Critical
Overpermissioned IAM role | Supervisor agent forwarding full context | Critical

Risk prioritization through toxic combinations is what separates a governance program that drives action from one that produces noise. Security teams reviewing hundreds of medium-severity alerts cannot act on all of them. A prioritized list of ten critical-severity toxic combinations is actionable. The AI agent risk assessment process should produce this output as its primary deliverable.
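One way to sketch this scoring is to treat each agent's active risk factors as a set and check it against the combinations listed above. The snake_case factor labels, the `BASE_SEVERITY` mapping, and the sample agents are assumptions made for illustration. A real model would likely weight more factors and more combinations.

```python
# Pairs mirror the combinations listed above; labels are illustrative.
TOXIC_PAIRS = [
    frozenset({"shadow_agent", "org_wide_access"}),
    frozenset({"maker_mode_connector", "sensitive_data_access"}),
    frozenset({"disabled_creator", "active_maker_credentials"}),
    frozenset({"hardcoded_secrets", "public_webhook"}),
    frozenset({"overpermissioned_iam_role", "supervisor_forwards_full_context"}),
]

RANK = {"none": 0, "medium": 1, "high": 2, "critical": 3}

# Assumption: factors default to medium unless listed here.
BASE_SEVERITY = {"maker_mode_connector": "high"}


def matched_pairs(factors: set[str]) -> list[frozenset[str]]:
    """Return every toxic pair fully contained in the agent's factors."""
    return [pair for pair in TOXIC_PAIRS if pair <= factors]


def severity(factors: set[str]) -> str:
    """Critical if any toxic pair matches, else the worst single factor."""
    if matched_pairs(factors):
        return "critical"
    return max((BASE_SEVERITY.get(f, "medium") for f in factors),
               key=RANK.__getitem__, default="none")


def prioritize(agents: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Sort agents from highest to lowest severity."""
    scored = [(name, severity(factors)) for name, factors in agents.items()]
    return sorted(scored, key=lambda item: RANK[item[1]], reverse=True)


# Hypothetical agents and their active risk factors.
agents = {
    "hr-helper": {"org_wide_access"},
    "weekend-bot": {"shadow_agent", "org_wide_access"},
    "deal-desk": {"maker_mode_connector"},
}
ranked = prioritize(agents)
```

The output is one ranked list per inventory rather than one alert per factor, which is the point of the pattern.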

Layer Five: Enforce Ownership and Manage Agent Lifecycle

Agents do not retire themselves. When the employee who built an agent leaves the organization, the agent continues running. Its credentials remain active. Its connections to SaaS applications remain live. Its data access persists. This is the orphaned agent problem, and it is the agentic equivalent of the stale service account: a machine insider with no human owner, operating with inherited permissions that no one is reviewing.

The governance program requires explicit ownership assignment for every agent in the inventory. Each agent needs a named owner, a backup owner, and a defined review cadence. When an owner's account is disabled, the governance program must trigger an immediate review of every agent that owner created or managed. Allowing an orphaned agent to continue operating with maker mode credentials after its creator leaves is a machine insider risk that no existing insider risk program covers.
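The review trigger described above can be sketched as a check that runs whenever the set of disabled accounts changes. The `OwnedAgent` type, the reason labels, and the sample data are hypothetical. Distinguishing agents with an active backup owner from those with none is an added assumption, not something the framework prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OwnedAgent:
    name: str
    owner: str
    backup_owner: str
    uses_maker_credentials: bool = False


def agents_needing_review(agents: list[OwnedAgent],
                          disabled_accounts: set[str]) -> list[tuple[str, str]]:
    """Flag every agent whose owner account is disabled, with a reason label."""
    flagged = []
    for agent in agents:
        if agent.owner not in disabled_accounts:
            continue
        if agent.backup_owner in disabled_accounts:
            reason = "no_active_owner"
        else:
            reason = "owner_disabled_backup_active"
        if agent.uses_maker_credentials:
            reason += "+maker_credentials"  # higher urgency per the text above
        flagged.append((agent.name, reason))
    return flagged


# Hypothetical agents and disabled accounts.
fleet = [
    OwnedAgent("hr-helper", "alice", "carol"),
    OwnedAgent("deal-desk", "bob", "dave", uses_maker_credentials=True),
    OwnedAgent("weekend-bot", "bob", "bob"),
]
to_review = agents_needing_review(fleet, disabled_accounts={"bob"})
```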

Lifecycle governance also addresses agents that have outlived their intended use case. An agent built for a specific campaign, project, or workflow may continue running indefinitely after that use case ends. Without a defined decommission process, these agents accumulate in the inventory, each one representing a persistent attack surface with no active human oversight. The AI agent governance program must include a lifecycle policy that defines how agents are reviewed, renewed, or retired.
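A lifecycle policy of this kind could be reduced to a small decision function. The 90-day review interval and the action labels below are illustrative assumptions. The framework itself does not specify a cadence.

```python
from datetime import date, timedelta

# Assumption: a 90-day review cadence, chosen only for illustration.
REVIEW_INTERVAL = timedelta(days=90)


def lifecycle_action(last_reviewed: date, use_case_ended: bool, today: date) -> str:
    """Return "retire", "review", or "no_action" for one agent."""
    if use_case_ended:
        return "retire"       # the intended use case is over
    if today - last_reviewed > REVIEW_INTERVAL:
        return "review"       # overdue: owner must renew or retire
    return "no_action"


today = date(2026, 6, 1)
campaign_bot = lifecycle_action(date(2026, 5, 1), use_case_ended=True, today=today)
stale_bot = lifecycle_action(date(2026, 1, 1), use_case_ended=False, today=today)
fresh_bot = lifecycle_action(date(2026, 5, 1), use_case_ended=False, today=today)
```

Passing `today` in explicitly keeps the function deterministic and easy to test.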

Layer Six: Enforcement as the Target State

Discovery, classification, risk prioritization, and ownership management are necessary governance layers. They are not sufficient. A governance program that detects a maker mode privilege escalation and generates an alert has not prevented the data access. It has documented it. The target state for mature AI agent governance is enforcement: the ability to apply deterministic guardrails to probabilistic agents before unauthorized actions complete.

Probabilistic agents are non-deterministic by design. They can deviate from intended goals, chain actions across multiple tools, and escalate access in ways their builders did not anticipate. Deterministic guardrails apply fixed, predictable enforcement rules to that dynamic behavior. They do not rely on the agent making the right decision. They enforce boundaries regardless of what the agent decides to do.
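As a rough sketch of what "deterministic" means here, the check below evaluates a proposed action against fixed rules and denies by default. The rule sets, agent names, and tool names are hypothetical. How such a check is wired into a given agent platform is outside the scope of this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedAction:
    agent: str
    tool: str
    target: str


# Hypothetical fixed rules: explicit allow list plus hard-blocked targets.
ALLOWED_TOOL_USE = {("hr-helper", "read_document"), ("deal-desk", "read_account")}
BLOCKED_TARGETS = {"payroll-db"}


def guardrail_allows(action: ProposedAction) -> bool:
    """Same input always yields the same decision; unknown actions are denied."""
    if action.target in BLOCKED_TARGETS:
        return False
    return (action.agent, action.tool) in ALLOWED_TOOL_USE


ok = guardrail_allows(ProposedAction("hr-helper", "read_document", "handbook"))
blocked = guardrail_allows(ProposedAction("hr-helper", "read_document", "payroll-db"))
unknown = guardrail_allows(ProposedAction("hr-helper", "delete_document", "handbook"))
```

The decision depends only on the rules and the proposed action, never on the agent's own reasoning.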

The distinction between runtime AI security and configuration-based visibility is the difference between a governance program with teeth and one without. Configuration-based tools show what agents are set up to do. Runtime security shows what agents are actually doing, at the moment they are doing it, and can act on that information within the enforcement window. For security teams that have been ghost chasing theoretical risks with no runtime evidence, this distinction defines the entire value proposition of a mature governance program.

It is also important to be precise about what runtime enforcement means architecturally. Enforcement operates through native platform APIs and webhooks, not as an inline network proxy or intercepting gateway. This distinction matters: a control plane approach that observes and enforces without sitting between the agent and its tools avoids the latency and single-point-of-failure risks that inline architectures introduce.

Enforcement as a target state means the governance program is designed from the beginning to move toward runtime controls, even if full enforcement capability is not yet available across every platform. Runtime guardrails are on the near-term roadmap, with Microsoft Copilot targeted for late Q1 2026 and additional platforms phasing in during Q2 2026. The inventory, classification, and risk prioritization layers build the data foundation that enforcement requires. A governance program that skips those layers cannot enforce anything meaningful when enforcement capability arrives.

Frequently Asked Questions

What is AI agent governance and why does it matter in 2026?

AI agent governance is the structured program that gives security teams visibility and control over autonomous agents operating across enterprise SaaS environments. It matters in 2026 because organizations are deploying agents faster than centralized oversight can track, and those agents carry credentials, access sensitive data, and take actions that legacy IAM and security programs were never designed to govern.

How is AI agent governance different from traditional IAM?

Traditional IAM manages human identity lifecycles through interactive authentication, manager-driven access reviews, and MFA. AI agents bypass all three: they authenticate with embedded tokens, operate continuously without human interaction, and hold permissions that no quarterly certification process reviews. Governance for AI agents requires correlating agent configuration with actual SaaS entitlements and runtime behavior, which traditional IAM cannot produce.

What is a toxic combination in the context of AI agent risk?

A toxic combination is a set of multiple risk factors appearing simultaneously on a single agent, where the combined severity is significantly higher than any individual factor. For example, an unsanctioned agent with org-wide access and a disabled creator account represents a critical-severity finding even if each factor alone would be medium severity. Identifying toxic combinations allows security teams to prioritize the highest-risk agents without drowning in individual alerts.

What is maker mode and why is it a governance priority?

Maker mode is a configuration in which an agent executes using the credentials of the user who built it, regardless of who invokes the agent. Any user who triggers a maker mode agent gains access to data at the builder's privilege level, bypassing all standard access controls. This is one of the most common privilege escalation vectors in production agentic environments and a top priority for governance programs to detect and remediate. Obsidian is the only platform that detects maker mode configurations at runtime across enterprise agent platforms.

What are orphaned AI agents and how do they create risk?

Orphaned agents are agents whose creator or owner account has been disabled, but the agents continue running with the inherited credentials and permissions of that former account. They represent machine insider risk with no active human oversight: their access is never reviewed, their behavior is never questioned, and their credentials remain valid until someone explicitly revokes them.

What does enforcement mean in an AI agent governance framework?

Enforcement means applying deterministic guardrails to agent behavior at runtime, before unauthorized actions complete. It is the target state for a mature governance program, built on the foundation of inventory, effective authority mapping, classification, risk prioritization, and ownership management. Without those foundational layers, enforcement has no reliable data to act on.