AI Agent Posture and Risk: Why Configuration Scoring Is Not Enough

Configuration scoring misses the real AI agent posture and risk. Learn how effective authority mapping and runtime truth reveal what static scans cannot.

Obsidian Editorial Team · Security Research · Obsidian Security · May 26, 2026 · June 1, 2026

Key Takeaways
  • Configuration scoring tells you what an agent is set up to do. Effective authority tells you what it can actually execute inside every connected SaaS application.
  • AI agents are non-human identities that move up to 16 times more data than human users, yet most insider risk programs have no coverage for them.
  • Orphaned agents, maker mode credential inheritance, and agent-to-agent data exposure represent risk factors that no static scan can surface without runtime context.
  • Toxic combinations, where multiple medium-severity risk factors stack on a single agent, produce critical-priority exposure that individual configuration checks miss entirely.
  • Runtime truth requires correlating agent configuration, SaaS entitlements, identity context, and actual behavior into a single picture of effective authority.

The Ghost Chasing Problem: What Configuration Scoring Actually Measures

Security teams managing AI agent deployments face a specific frustration. They run configuration scans. They get findings. They chase those findings through remediation workflows. Then an incident occurs that the scan never flagged, because the scan was measuring policy, not behavior.

One enterprise security team described this experience precisely: they were ghost chasing, reviewing theoretical risks with no runtime evidence of what actually happened. The configuration said the agent had scoped permissions. The runtime showed the agent had accessed records the invoking user was never authorized to see.

Configuration scoring measures what an agent is set up to do at the moment the scan runs. It checks whether a connector exists, whether a permission scope is broad, whether an agent is publicly accessible. These are useful signals. They are not sufficient signals.

The mechanism behind the gap is straightforward. AI agents are probabilistic systems. They make decisions at runtime based on context, instructions, and the tools available to them. A static configuration check cannot predict which tool calls an agent will chain together, which credentials it will invoke, or which downstream SaaS application will receive a request it was never intended to receive. Configuration is a snapshot. Runtime is where risk becomes real.

For security engineers and CISOs trying to govern AI agent security across their environment, this distinction matters enormously. A posture score of "medium" on an agent that happens to be running with an administrator's embedded credentials and org-wide access is not a medium risk. It is a critical risk that the scoring model failed to surface.

Effective Authority vs. Theoretical Configuration

The core question in AI agent posture and risk assessment is not "what is this agent configured to do?" It is "what can this agent actually execute, on whose behalf, and with what downstream reach?"

Answering that question requires mapping effective authority: the actual access an agent holds inside every SaaS application it connects to, after all entitlements resolve. Theoretical configuration is what the policy file says. Effective authority is what happens when the agent runs.

The distinction surfaces most clearly in maker mode scenarios. Many enterprise agent platforms allow builders to create agents using their own credentials as the connection mechanism. When a lower-privilege user invokes that agent, the agent executes using the builder's credentials, not the invoker's. The invoker's IAM profile is irrelevant. The agent bypasses every access control the organization has applied to that user.

No configuration scan surfaces this risk accurately without correlating three separate data points: the invoker's identity, the agent's embedded credentials, and the entitlements those credentials carry inside the connected SaaS application. That correlation requires visibility into agent platforms and SaaS entitlements at the same time. Most tools see one or the other. Neither view alone produces effective authority.
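The three-way correlation can be sketched in a few lines of Python. This is a minimal illustration, not any platform's actual API: the `Agent` record, the `ENTITLEMENTS` table, and the account and permission names are all hypothetical stand-ins for data that would come from the agent platform and the connected SaaS applications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    builder: str      # account whose credentials are embedded in the agent
    maker_mode: bool  # True: runs as the builder; False: runs as the invoker


# Hypothetical entitlement data: user -> set of (application, permission).
ENTITLEMENTS = {
    "admin_alice": {("crm", "read_all"), ("hr", "read_all")},
    "analyst_bob": {("crm", "read_own")},
}


def effective_authority(agent, invoker, entitlements):
    """Entitlements the agent actually exercises when `invoker` runs it."""
    principal = agent.builder if agent.maker_mode else invoker
    return entitlements.get(principal, set())


def escalation(agent, invoker, entitlements):
    """Access the invoker gains through the agent beyond their own entitlements."""
    own = entitlements.get(invoker, set())
    return effective_authority(agent, invoker, entitlements) - own


agent = Agent("sales-helper", builder="admin_alice", maker_mode=True)
gained = escalation(agent, "analyst_bob", ENTITLEMENTS)
print(sorted(gained))
```

A non-empty `escalation` result is the finding: the invoker reaches data through the agent that their own IAM profile does not grant. Neither the agent record nor the entitlement table produces it alone.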

Non-human identities already outnumber human identities by 25 to 50 times in modern enterprises. Every AI agent deployment adds to that count. Traditional IAM programs were designed around human lifecycle events: onboarding, role changes, offboarding. None of those triggers apply to agents. Agents do not get quarterly access reviews. They do not trigger MFA challenges. They persist indefinitely unless someone actively decommissions them. Understanding non-human identity security for AI agents is now a prerequisite for any accurate risk assessment.

The Four Risk Factors That Configuration Scores Miss

Static configuration scoring catches some risks. The following four categories consistently fall outside its detection range.

Orphaned agents. An agent whose creator account has been disabled continues running with the credentials it inherited at creation. The creator's access is revoked from the identity provider. The agent's embedded token is not. Security teams at multiple enterprises have discovered dozens to hundreds of agents in this state during assessments. One organization found 377 Copilot agents they had no record of, many with disabled owner accounts. Configuration scans flag the disabled account. They do not correlate that status with the agent's continued operational access.
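The missing correlation is small once both data sets are in one place. The sketch below assumes a hypothetical `AgentRecord` export and a set of disabled account names from the identity provider; the field names are illustrative, not taken from any real platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRecord:
    name: str
    owner: str          # creator account recorded on the agent platform
    token_active: bool  # whether the embedded credential still works


def find_orphaned(agents, disabled_accounts):
    """Agents whose owner is disabled but whose embedded token is still live."""
    return [
        a.name
        for a in agents
        if a.owner in disabled_accounts and a.token_active
    ]


agents = [
    AgentRecord("expense-bot", owner="carol", token_active=True),
    AgentRecord("notes-bot", owner="dave", token_active=True),
    AgentRecord("old-bot", owner="carol", token_active=False),
]
orphaned = find_orphaned(agents, disabled_accounts={"carol"})
print(orphaned)
```

The disabled-account list alone flags `carol`; only the join against token status identifies which of her agents still hold working access.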

Maker mode credential inheritance. As described above, agents built on creator credentials expose every invoker to the creator's privilege level. This is not a misconfiguration in the traditional sense. The platform is functioning as designed. The risk exists in the relationship between the invoker's identity and the agent's embedded authority: a relationship that only becomes visible when you map both simultaneously.

Agent-to-agent data exposure. When Agent A (limited permissions) communicates with Agent B (broad permissions), Agent B may expose data that Agent A was never authorized to access directly. The request chain looks legitimate at each hop. No individual agent appears misconfigured. The risk lives in the interaction topology, the structure of how agents connect and delegate, not in any single agent's configuration. A growing body of security research argues that safety in agentic AI depends as much on this interaction topology as on individual model configuration, which directly challenges the sufficiency of per-agent configuration scoring.
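One way to reason about interaction topology is to treat delegation as a directed graph and ask what data is reachable from each agent. The sketch below is an assumption-laden illustration: `delegates_to` and `direct_access` are hypothetical maps, and a breadth-first traversal stands in for whatever analysis a real tool performs.

```python
from collections import deque


def reachable_data(start, delegates_to, direct_access):
    """Data sources reachable from `start` by following delegation edges (BFS)."""
    seen = {start}
    queue = deque([start])
    data = set()
    while queue:
        agent = queue.popleft()
        data |= direct_access.get(agent, set())
        for nxt in delegates_to.get(agent, ()):
            if nxt not in seen:  # guard against delegation cycles
                seen.add(nxt)
                queue.append(nxt)
    return data


# Hypothetical topology: Agent A can call Agent B.
delegates_to = {"agent_a": ["agent_b"], "agent_b": []}
direct_access = {"agent_a": {"wiki"}, "agent_b": {"hr_records", "wiki"}}

# Data Agent A can reach only through other agents.
indirect = reachable_data("agent_a", delegates_to, direct_access) - direct_access["agent_a"]
print(indirect)
```

Each agent's own configuration looks unremarkable here. The exposure appears only in the difference between what an agent reaches through the graph and what it holds directly.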

Shadow agents. AI agents deployed without IT or security oversight do not appear in any sanctioned inventory. They cannot be scored if they are not known. Across enterprise customers, proper inventory exercises consistently discover far more agents than security teams believed existed. You cannot govern what you cannot see, and configuration scoring cannot score what it has never found.

For a deeper look at how these vectors combine, shadow AI security covers the compounding mechanics in detail.

Toxic Combinations: When Risk Stacks Reach Critical Severity

Individual risk factors often score at medium severity in isolation. The problem is that AI agents rarely carry just one risk factor. When multiple medium-severity conditions exist on the same agent simultaneously, the combined exposure reaches critical priority in ways that per-factor scoring never surfaces.

Consider this combination: an agent is publicly accessible (medium), its creator account is disabled (medium), and it holds a connector operating in maker mode with access to a sensitive data source (high). Scored individually, two of three findings are medium. Scored as a combination, this agent represents an unauthenticated, orphaned pathway to sensitive data with no active owner to respond to an incident.

This is what security teams mean when they describe AI agent security as a fundamentally different problem from traditional access governance. The risk is not in any single property. It is in the combination.

Effective AI agent posture and risk assessment requires a scoring model that identifies these toxic combinations explicitly, surfaces them as a distinct priority tier, and maps them to the specific data sources and SaaS applications within blast radius. A flat list of per-agent findings sorted by individual severity will consistently bury the most dangerous agents in the middle of the queue.
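The difference between per-factor and combination scoring can be shown with the example from this section. The severity labels follow the text; the factor names, the rule table, and the choice to escalate a matching combination straight to "critical" are assumptions made for this sketch, not a published scoring model.

```python
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}

# Per-factor severities, matching the example in this section.
FACTOR_SEVERITY = {
    "publicly_accessible": "medium",
    "owner_disabled": "medium",
    "maker_mode_sensitive_data": "high",
}

# Hypothetical rule table: factor sets that together escalate to critical.
TOXIC_COMBINATIONS = [
    frozenset({"publicly_accessible", "owner_disabled", "maker_mode_sensitive_data"}),
]


def per_factor_severity(factors):
    """Highest severity among the factors, each scored in isolation."""
    return max(
        (FACTOR_SEVERITY[f] for f in factors),
        key=SEVERITY_RANK.__getitem__,
        default="low",
    )


def combined_severity(factors):
    """Escalate to critical when the agent carries a full toxic combination."""
    present = set(factors)
    if any(combo <= present for combo in TOXIC_COMBINATIONS):
        return "critical"
    return per_factor_severity(present)


agent_factors = ["publicly_accessible", "owner_disabled", "maker_mode_sensitive_data"]
print(per_factor_severity(agent_factors), combined_severity(agent_factors))
```

A flat, per-factor sort would rank this agent as "high" alongside every other agent with one high finding. The combination rule is what lifts it into its own tier.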

Blast Radius Prioritization: Ranking Agents by What They Can Actually Damage

Blast radius is the scope of damage a compromised or misconfigured agent could cause given its current entitlements. It is the right unit of measurement for prioritizing remediation in an environment where security teams cannot remediate every finding simultaneously.

An agent with broad OAuth scopes connected to a low-sensitivity internal wiki has a small blast radius. An agent with maker mode credentials connected to your CRM, HR system, and financial reporting platform has a blast radius that spans your most sensitive data categories. Both agents may carry similar configuration-level risk scores. Their actual risk profiles are orders of magnitude apart.

Blast radius prioritization requires knowing three things that configuration scoring alone cannot provide. First, what SaaS applications does this agent connect to, and what data classifications exist within those applications? Second, what is the agent's effective authority inside each connected application, not just its stated scope? Third, what is the agent's current operational status: active, dormant, or orphaned?
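Those three inputs can be combined into a simple ranking. The sketch below is illustrative only: the `APP_SENSITIVITY` weights are invented, summing them is one arbitrary way to approximate blast radius, and using orphaned status as a tie-breaker is an assumption rather than a recommendation from any framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    name: str
    effective_apps: tuple  # apps where the agent holds effective authority
    status: str            # "active", "dormant", or "orphaned"


# Hypothetical data-sensitivity weights per connected application.
APP_SENSITIVITY = {"wiki": 1, "crm": 3, "hr": 4, "finance": 4}


def blast_radius(agent):
    """Sum of sensitivity weights across the apps the agent can actually reach."""
    return sum(APP_SENSITIVITY.get(app, 0) for app in agent.effective_apps)


def prioritize(agents):
    """Largest blast radius first; orphaned agents win ties; then by name."""
    return sorted(
        agents,
        key=lambda a: (-blast_radius(a), a.status != "orphaned", a.name),
    )


fleet = [
    AgentProfile("wiki-bot", ("wiki",), "active"),
    AgentProfile("revenue-bot", ("crm", "hr", "finance"), "orphaned"),
    AgentProfile("crm-bot", ("crm",), "dormant"),
]
ranked = [a.name for a in prioritize(fleet)]
print(ranked)
```

The ranking depends entirely on `effective_apps` reflecting effective authority rather than stated scope. Fed with configured scopes instead, the same code reproduces the flat scoring problem described above.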

AI agents move up to 16 times more data than human users when operating normally. A compromised agent with broad entitlements does not move data at human speed. It moves data at machine speed, across every connected application, before any alert fires. Blast radius prioritization is how security teams identify which agents require immediate remediation versus which can follow standard ticketing workflows.

The AI agent risk assessment framework provides a structured approach to mapping blast radius across your agent inventory.

From Posture to Runtime Truth: What a Real Assessment Requires

A complete AI agent posture and risk program requires four sequential capabilities. Configuration scoring addresses part of the first. Runtime truth addresses all four.

Complete inventory. No assessment is meaningful without knowing every agent in your environment, including shadow agents deployed by business users without security oversight. This requires discovery that goes beyond sanctioned platform APIs to surface agents that were never registered with IT. Inventory is the prerequisite. Without it, every subsequent analysis has unknown blind spots.

Effective authority mapping. For each discovered agent, the assessment must map what the agent can actually do inside every connected SaaS application. This means correlating agent configuration with SaaS entitlements from the identity provider and the application itself, not just reading the agent's stated permission scopes. The maker mode correlation (invoker identity versus builder credentials) is the specific mapping that most tools cannot produce.

Toxic combination scoring. Risk factors must be evaluated in combination, not in isolation. An assessment that scores each factor independently and presents a flat findings list will consistently misprioritize the most dangerous agents. Toxic combination detection requires a model that identifies when multiple risk factors on a single agent create compounding exposure.

Runtime evidence. The final and most critical layer is runtime truth: evidence of what agents actually did, what data they accessed, which tool calls they made, and whether any of that activity was policy-aligned. Configuration tells you what could happen. Runtime tells you what did happen. Security teams that rely solely on configuration are, to borrow the phrase from the team quoted earlier, ghost chasing.
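As a rough illustration of what "policy-aligned" can mean in practice, the sketch below checks recorded agent activity against the invoking user's own entitlements. The `AccessEvent` shape and the `invoker_allowed` map are hypothetical; real audit logs and entitlement sources will differ by platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    agent: str
    invoker: str
    app: str
    resource: str


def violations(events, invoker_allowed):
    """Events where an agent touched a resource its invoker is not entitled to."""
    return [
        e
        for e in events
        if (e.app, e.resource) not in invoker_allowed.get(e.invoker, set())
    ]


# Hypothetical entitlements: invoker -> set of (app, resource) they may access.
invoker_allowed = {"bob": {("crm", "own_accounts")}}

events = [
    AccessEvent("sales-helper", "bob", "crm", "own_accounts"),
    AccessEvent("sales-helper", "bob", "crm", "all_accounts"),
]
flagged = violations(events, invoker_allowed)
print([(e.agent, e.resource) for e in flagged])
```

This is the runtime counterpart to the scenario from the opening section: the configuration says the agent is scoped, and the event log shows it reading records the invoker could never open directly.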

Probabilistic agents require deterministic guardrails. The path from posture assessment to enforcement starts with runtime visibility, knowing exactly what each agent did and what it is currently authorized to do. That visibility is the present-tense foundation on which any meaningful control layer must be built, and enforcement is the target state that follows from it.

To understand how this fits a broader program, AI agent governance covers where posture assessment sits alongside inventory, classification, and control.

Frequently Asked Questions

What is the difference between configuration scoring and effective authority in AI agent risk assessment?

Configuration scoring evaluates what an agent is set up to do based on its policy settings and stated permission scopes. Effective authority maps what the agent can actually execute inside every connected SaaS application after all entitlements resolve. An agent may have a moderate configuration score while holding embedded credentials that grant it administrative access to sensitive systems: a risk that only effective authority mapping reveals.

Why do orphaned AI agents represent a security risk?

An orphaned agent is one whose creator or owner account has been disabled, but the agent continues running with the credentials it inherited at creation. The creator's access is revoked at the identity provider level, but the agent's embedded token remains active. This creates an unowned, unmonitored access pathway with no active owner to respond if the agent is misused or compromised. This is machine insider risk with no human counterpart managing it.

What is a toxic combination in the context of AI agent posture and risk?

A toxic combination occurs when multiple risk factors exist simultaneously on a single agent, creating compounding exposure that exceeds what any individual factor would suggest. For example, an agent that is publicly accessible, has a disabled owner account, and operates in maker mode with sensitive data access represents a critical-priority risk even though no individual factor scores as critical in isolation.

What is maker mode and why does it create privilege escalation risk?

Maker mode refers to agents built using the creator's own credentials as the connection mechanism. When any user invokes the agent, it executes using the builder's credentials rather than the invoker's. A user with no direct access to a sensitive SaaS application can invoke a maker mode agent built by an administrator and effectively access data at the administrator's privilege level, bypassing every IAM control applied to the invoking user.

Why can't existing IAM tools govern AI agent risk?

Traditional IAM programs are designed around human identity lifecycle events: onboarding, role changes, and offboarding. AI agents do not trigger these events. They do not receive quarterly access reviews, do not authenticate interactively, and persist indefinitely without lifecycle management. Non-human identities already outnumber human identities by 25 to 50 times in modern enterprises, and AI agents represent the fastest-growing segment of that population. Existing IAM tools have no model for agent identity, delegation, or permission propagation.

What does blast radius mean for AI agent security?

Blast radius describes the scope of damage a compromised or misconfigured agent could cause given its current entitlements. An agent connected to low-sensitivity internal tools has a small blast radius. An agent with maker mode credentials connected to CRM, HR, and financial systems has a blast radius spanning the organization's most sensitive data. Because AI agents can move data at machine speed across all connected applications simultaneously, blast radius prioritization is the correct method for ranking which agents require immediate remediation.