What Are Agentic Guardrails? Deterministic Controls for Probabilistic Systems

Ninety percent of AI agents running inside enterprise SaaS environments hold more permissions than their workflows ever require.

Obsidian Editorial Team

Security Research

Obsidian Security

May 16, 2026

June 1, 2026

Key Takeaways

Ninety percent of AI agents running inside enterprise SaaS environments hold more permissions than their workflows ever require.
They inherit credentials from their creators, connect to tools their invokers were never authorized to use, and execute multi-step action chains that no access policy was designed to govern.
The gap between what an agent is configured to do and what it can actually do is where security programs break down.
- Agentic guardrails are distinct from model-level safety techniques like RLHF.
- AI agents do not follow predictable execution paths. Configuration-based controls cannot account for what agents actually do once deployed.
- Four guardrail types cover the full attack surface: access, action, data, and identity.
- Runtime enforcement does not require a SaaS connector for every tool.

What Agentic Guardrails Are

Security teams deploying AI agents across enterprise platforms ask a consistent question: how do you enforce a rule on a system that does not follow fixed rules by design? That question defines the agentic guardrails problem.

Agentic guardrails are deterministic enforcement controls applied to AI agents at runtime. They define fixed boundaries around what an agent can access, what actions it can take, what data it can move, and whose identity it can act on behalf of. They do not rely on the agent making the right decision. They enforce the boundary regardless of what the agent decides.

This is a critical distinction from model-layer safety techniques. Alignment training, RLHF, and content filters operate on the model's outputs. They shape what the model is likely to say or do. Agentic guardrails operate on the agent's actions inside your environment. They govern what the agent is permitted to execute against your SaaS applications, your data, and your identity infrastructure.

The operational scope is different too. Model-layer safety addresses the risk that a model produces harmful content. Agentic guardrails address the risk that an agent with valid credentials and legitimate tool access takes actions that violate your security policy. The cause may be misconfiguration, privilege escalation, action chaining, or a confused deputy scenario, in which a user manipulates an agent into performing actions above that user's own permission level.

Put simply: model safety governs what the agent says. Agentic guardrails govern what the agent does.

Why Probabilistic Agents Need Deterministic Controls

The architectural argument for deterministic guardrails starts with a simple fact: AI agents are probabilistic systems. They do not execute a fixed instruction set. They reason, plan, and select actions based on context. Two identical prompts can produce different action sequences. The same agent can take a different path through your SaaS environment on Tuesday than it did on Monday.

Posture-based controls assume predictable execution. They check configuration at a point in time: does this agent have the right OAuth scopes? Is the connector set up correctly? Those checks produce what security teams call theoretical configuration, a snapshot of what the agent is supposed to be able to do. They cannot account for what the agent actually does when a user invokes it with a specific request in a specific context.

The failure mode is concrete. A Copilot Studio agent configured with a connector in maker mode uses the creator's credentials for every invocation. A user without Salesforce access invokes that agent and asks it to pull pipeline data. The configuration looks correct. The connector is authorized. The agent is operating exactly as designed. And a user just accessed data they have no right to see. Posture controls never flagged this because nothing in the configuration was wrong. The runtime behavior was the problem.

This is why security teams describe their current state as ghost chasing: reviewing theoretical configuration signals with no evidence of what actually happened. The shift from posture to runtime evidence is the central evolution in AI agent security monitoring.

Deterministic guardrails break this pattern. They do not ask what the agent could do. They enforce what the agent is permitted to do, at the moment of execution, against fixed rules that do not change based on the agent's probabilistic reasoning.
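To make the contrast concrete, here is a minimal sketch of a deterministic check wrapped around a probabilistic planner. Everything in it is illustrative: `ALLOWED_ACTIONS`, `propose_action`, and `guard` are hypothetical names, and the random choice merely stands in for an agent's planning. The point is that the verdict depends only on the proposed action and a fixed rule set, never on how the agent arrived at it.

```python
import random

# Hypothetical fixed policy: the only (tool, operation) pairs this agent may execute.
ALLOWED_ACTIONS = frozenset({("crm", "read_record"), ("docs", "summarize")})


def propose_action(rng: random.Random) -> tuple[str, str]:
    """Stand-in for probabilistic planning: the agent may pick any of these."""
    candidates = [
        ("crm", "read_record"),
        ("crm", "bulk_export"),
        ("docs", "summarize"),
        ("docs", "delete"),
    ]
    return rng.choice(candidates)


def guard(action: tuple[str, str]) -> bool:
    """Deterministic verdict: same action, same policy, same answer every time."""
    return action in ALLOWED_ACTIONS


# The agent's choice varies from run to run; the verdict for a given choice does not.
rng = random.Random()
for _ in range(5):
    action = propose_action(rng)
    print("ALLOW" if guard(action) else "BLOCK", action)
```

Running the loop twice will usually print different action sequences, but any given action is always allowed or always blocked.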

The Four Types of Agentic Guardrails

Security teams building an agentic guardrails framework need to cover four distinct control surfaces. Each one addresses a different vector in the AI agent risk landscape.

Access Guardrails: Permission Boundaries

Access guardrails enforce the principle of least privilege at the agent level. They define which SaaS applications, data sources, and tool connectors an agent is authorized to reach, and block access attempts that fall outside that boundary.

The core problem access guardrails solve is over-permissioning. When 90% of agents hold more access than their workflows require, the blast radius of any misconfiguration or compromise expands dramatically. Access guardrails reduce that blast radius by enforcing scope at runtime, not just at configuration time.

Key enforcement targets for access guardrails:
- Connectors operating in maker mode with sensitive data access
- Agents configured as org-wide or publicly accessible
- Connections to unsanctioned or shadow applications
- OAuth scopes that exceed workflow requirements
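Two of these targets, unsanctioned connectors and excess OAuth scopes, reduce to simple set checks. The sketch below assumes a hypothetical `AccessPolicy` record; the connector and scope names are made up for illustration and do not correspond to any particular platform's identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical per-agent policy: where it may connect and what it needs."""
    allowed_connectors: frozenset[str]
    required_scopes: frozenset[str]


def connector_allowed(policy: AccessPolicy, connector: str) -> bool:
    """Block any connector that is not explicitly sanctioned for this agent."""
    return connector in policy.allowed_connectors


def excess_scopes(policy: AccessPolicy, granted_scopes: set[str]) -> set[str]:
    """Scopes the agent holds but its workflow does not require."""
    return set(granted_scopes) - policy.required_scopes


policy = AccessPolicy(
    allowed_connectors=frozenset({"sharepoint"}),
    required_scopes=frozenset({"files.read"}),
)
print(connector_allowed(policy, "personal-file-share"))
print(excess_scopes(policy, {"files.read", "files.write_all"}))
```

An allowlist is used rather than a blocklist so that a connector nobody has reviewed is denied by default.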

Action Guardrails: What Agents Can Execute

Action guardrails restrict the specific operations an agent can perform, even when it has access to the tool that performs them. An agent with read access to a CRM should not be able to execute bulk export operations. An agent authorized to summarize documents should not be able to delete them.

Action chaining is the primary threat action guardrails address. Agents can sequence multiple tool calls across applications, compounding their effective authority with each step. A single action that looks benign can become a data exfiltration path when chained with three subsequent calls. Action guardrails interrupt that chain at the point where the sequence violates policy.
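One way to picture chain interruption is a guard that remembers which actions have already run in a session and refuses the call that would complete a forbidden sequence. The following is a sketch under assumptions: `ChainGuard`, the action names, and the ordered-subsequence matching rule are all hypothetical choices, not a description of any product's detection logic.

```python
def is_subsequence(pattern: tuple[str, ...], history: list[str]) -> bool:
    """True if pattern appears in history in order (not necessarily contiguously)."""
    remaining = iter(history)
    return all(step in remaining for step in pattern)


class ChainGuard:
    """Tracks executed actions and blocks the call that completes a forbidden chain."""

    def __init__(self, forbidden: list[tuple[str, ...]]) -> None:
        self.forbidden = forbidden
        self.history: list[str] = []

    def authorize(self, action: str) -> bool:
        candidate = self.history + [action]
        if any(is_subsequence(p, candidate) for p in self.forbidden):
            return False  # blocked calls never ran, so they are not recorded
        self.history = candidate
        return True


# Hypothetical exfiltration path: read CRM data, stage it in a file, send it out.
guard = ChainGuard([("crm.read", "file.create", "email.send_external")])
for call in ["crm.read", "file.create", "docs.summarize", "email.send_external"]:
    print(call, "->", "ALLOW" if guard.authorize(call) else "BLOCK")
```

Each of the first three calls is benign on its own; only the fourth is blocked, because it is the step that turns the sequence into an exfiltration path.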

Data Guardrails: What Agents Can Move

Data guardrails govern the movement of sensitive information through agent workflows. They enforce boundaries around what data an agent can read, copy, transmit, or write to external systems.

AI agents move 16 times more data than human users. They operate at machine speed without the friction that slows human data movement. Data guardrails apply the same controls that insider risk programs apply to human users, but at the speed and scale that agent activity requires. This includes detecting when agents access files with sensitivity labels, when knowledge bases contain PII, and when agent outputs route sensitive content to external endpoints.
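As a rough illustration of the last case, the sketch below blocks a transfer only when labeled content is headed to a destination outside the organization. The label names, the `INTERNAL_DOMAINS` set, and both function names are assumptions made for the example; a real deployment would take them from its own classification scheme and domain inventory.

```python
from urllib.parse import urlparse

# Hypothetical values: substitute your own domains and sensitivity labels.
INTERNAL_DOMAINS = {"example.com"}
BLOCKED_LABELS = {"confidential", "restricted"}


def is_external(url: str) -> bool:
    """Treat anything that is not a known internal host as external (fail closed)."""
    host = (urlparse(url).hostname or "").lower()
    return not any(host == d or host.endswith("." + d) for d in INTERNAL_DOMAINS)


def allow_transfer(labels: set[str], destination_url: str) -> bool:
    """Deny only the combination of sensitive content and an external destination."""
    sensitive = bool({label.lower() for label in labels} & BLOCKED_LABELS)
    return not (sensitive and is_external(destination_url))


print(allow_transfer({"Confidential"}, "https://files.example.com/upload"))
print(allow_transfer({"Confidential"}, "https://paste.example.net/new"))
```

A destination that cannot be parsed into a hostname is treated as external, so malformed URLs do not slip past the check.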

Identity Guardrails: Invoker Identity Enforcement

Identity guardrails are the least mature control surface and the most consequential. They enforce a rule that sounds obvious but is absent from most enterprise environments: the permissions available to an agent at runtime should reflect the identity of the person invoking it, not just the identity of the person who built it.

The maker mode vulnerability makes this concrete. When an agent is built using the creator's credentials, every invoker accesses data at the creator's privilege level. A user without Salesforce admin rights invokes an admin-built agent and operates with admin authority. Identity guardrails correlate the runner's identity against the maker's permissions and flag the escalation before the data access completes.
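The rule itself is small enough to sketch. The function below is a hypothetical illustration, not a platform API: it allows a request only when both the maker's credential and the invoker hold the requested permission, and it names the escalation case explicitly so it can be flagged.

```python
def authorize_invocation(
    invoker_perms: set[str], maker_perms: set[str], requested: str
) -> tuple[bool, str]:
    """Allow only what both the credential in use and the invoker are entitled to."""
    if requested not in maker_perms:
        return False, "denied: credential lacks this permission"
    if requested not in invoker_perms:
        return False, "escalation: invoker lacks a permission the maker credential holds"
    return True, "ok"


# Hypothetical permission names for an admin-built agent and a non-admin invoker.
maker = {"salesforce.read", "salesforce.admin"}
invoker = {"sharepoint.read"}
print(authorize_invocation(invoker, maker, "salesforce.read"))
```

The effect is that the agent's usable authority at runtime is the intersection of the maker's and the invoker's permissions rather than the maker's alone.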

For a detailed breakdown of how toxic combinations across these four surfaces create compounding critical risk, see the analysis of AI agent toxic risk combinations.

How Runtime Enforcement Works

The practical objection to agentic guardrails is operational: if enforcement requires a connector to every SaaS application an agent touches, the security team depends on IT buy-in for every integration. That dependency creates a deployment blocker that leaves security programs behind the curve.

Runtime enforcement does not require a SaaS connector for every tool. It requires visibility at the platform layer where agents are built and invoked, combined with an identity graph that maps effective authority across the SaaS environment.

Obsidian connects directly into AI agent platforms via native APIs and webhooks. This is not an inline proxy. It is not a network-layer intercept. It connects at the platform layer where agents are registered, configured, and invoked. From that position, it captures every tool call, every connector invocation, and every identity context in real time.

The identity graph does the correlation work. It maps the relationship between the runner's identity (the person invoking the agent), the agent's configured credentials (the maker's authority), and the actual entitlements those credentials carry inside each connected SaaS application. That correlation produces effective authority: what the agent can actually do, not what its configuration says it should be able to do.
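A toy version of that correlation might look like the following. The graph here is just a nested dictionary, and `Agent`, `ENTITLEMENTS`, and both functions are hypothetical; the sketch only shows the shape of the computation, in which the credential owner is resolved first and that owner's entitlements are then compared with the runner's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    maker: str
    maker_mode: bool  # True: connectors always use the maker's credentials


# Hypothetical entitlement edges: identity -> application -> permissions.
ENTITLEMENTS: dict[str, dict[str, set[str]]] = {
    "alice": {"salesforce": {"read", "export", "admin"}},
    "bob": {"sharepoint": {"read"}},
}


def effective_authority(agent: Agent, runner: str) -> dict[str, set[str]]:
    """Entitlements of whichever identity's credentials the agent actually uses."""
    credential_owner = agent.maker if agent.maker_mode else runner
    return ENTITLEMENTS.get(credential_owner, {})


def escalations(agent: Agent, runner: str) -> dict[str, set[str]]:
    """Per application, permissions the agent can exercise that the runner lacks."""
    runner_entitlements = ENTITLEMENTS.get(runner, {})
    findings = {}
    for app, perms in effective_authority(agent, runner).items():
        extra = perms - runner_entitlements.get(app, set())
        if extra:
            findings[app] = extra
    return findings


bot = Agent(name="pipeline-bot", maker="alice", maker_mode=True)
print(escalations(bot, "bob"))
```

With `maker_mode` set to `False`, the same agent runs on the runner's own entitlements and the findings come back empty.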

The platform versus runtime distinction matters here. Platform-native tools see what agents are configured to do. Runtime enforcement shows what the agent did, whose identity it acted on behalf of, what data it accessed, and whether any of that was policy-aligned. No connector to Salesforce is required to know that an agent used Salesforce admin credentials on behalf of a user who has no Salesforce provisioning.

Implementing Agentic Guardrails in Practice

Security teams asking where to start with agentic guardrails face a sequencing problem. Enforcement without inventory is noise. Identity guardrails without an authority map produce false positives. The implementation sequence matters as much as the controls themselves.

Step 1: Build the agent inventory. You cannot enforce guardrails on agents you do not know exist. The starting point is a complete inventory: every agent, its creator, its platform, its connected tools, and its current access scope. Enterprises consistently discover agents they did not know existed. One enterprise found 377 Copilot agents through an assessment. Another had 2,500 agents created before any inventory existed. Inventory is the prerequisite for every subsequent control.

Step 2: Map effective authority, not theoretical configuration. Once the inventory exists, the next question is what each agent can actually do. That requires correlating agent configuration with actual SaaS entitlements, not just reading the connector settings. The agents with the largest gap between theoretical configuration and effective authority are the highest-priority remediation targets.
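The prioritization in this step can be sketched as a set difference followed by a sort. The code assumes you already have, for each agent, one set of permissions taken from its configuration and one set derived from actual entitlements; the function names and permission strings are hypothetical.

```python
def authority_gap(configured: set[str], effective: set[str]) -> set[str]:
    """Permissions the agent can actually exercise beyond what its configuration states."""
    return effective - configured


def rank_by_gap(agents: dict[str, tuple[set[str], set[str]]]) -> list[tuple[str, int]]:
    """Order agents by gap size, largest first; ties are broken by name."""
    sizes = {name: len(authority_gap(cfg, eff)) for name, (cfg, eff) in agents.items()}
    return sorted(sizes.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical inventory: agent name -> (configured permissions, effective permissions).
inventory = {
    "pipeline-bot": ({"crm.read"}, {"crm.read", "crm.export", "crm.admin"}),
    "doc-helper": ({"docs.read"}, {"docs.read"}),
    "ticket-triage": ({"tickets.read"}, {"tickets.read", "tickets.write"}),
}
print(rank_by_gap(inventory))
```

Counting permissions is a deliberately crude measure. Weighting each excess permission by sensitivity would be a natural refinement.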

Step 3: Identify toxic combinations. Individual risk factors are often medium severity. The critical-priority cases emerge when multiple risk factors appear on the same agent simultaneously. An agent that is publicly accessible, running in maker mode with admin credentials, and connected to a shadow application represents a compounding risk that no single-factor scan surfaces. Prioritize enforcement on agents carrying multiple simultaneous risk factors.
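A simple way to surface these cases is to count co-occurring risk factors per agent. In the sketch below, the four factors are drawn from the examples in this article, but the `AgentRisk` record and the thresholds (one factor is medium, two or more is critical) are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRisk:
    name: str
    publicly_accessible: bool
    maker_mode_admin: bool
    shadow_app_connection: bool
    excess_oauth_scopes: bool


def risk_factors(agent: AgentRisk) -> list[str]:
    """Names of the risk factors present on this agent."""
    flags = {
        "publicly_accessible": agent.publicly_accessible,
        "maker_mode_admin": agent.maker_mode_admin,
        "shadow_app_connection": agent.shadow_app_connection,
        "excess_oauth_scopes": agent.excess_oauth_scopes,
    }
    return [factor for factor, present in flags.items() if present]


def priority(agent: AgentRisk) -> str:
    """Assumed thresholds: a single factor is medium, a combination is critical."""
    count = len(risk_factors(agent))
    if count >= 2:
        return "critical"
    return "medium" if count == 1 else "none"


bot = AgentRisk("pipeline-bot", True, True, True, False)
print(priority(bot), risk_factors(bot))
```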

Step 4: Apply deterministic guardrails by type. Starting with access guardrails produces the fastest risk reduction. Restricting agents to authorized connectors and enforcing least privilege on OAuth scopes reduces blast radius immediately. Action guardrails come next, followed by data guardrails for agents with access to sensitive labeled content. Identity guardrails, particularly maker mode enforcement, require the identity graph to be in place before they produce reliable signal.

Step 5: Maintain the inventory as a living system. Agents are deployed continuously. New connectors are added. Creators leave organizations, creating orphaned agents that continue running with inherited credentials. The inventory and authority map require continuous refresh, not a one-time audit.
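Continuous refresh amounts to diffing each new inventory snapshot against the previous one and rechecking creators against the current user directory. A minimal sketch, assuming the inventory is available as a mapping from agent name to creator and that a set of active users can be obtained from your identity provider (both are hypothetical inputs here):

```python
def refresh(
    previous: dict[str, str], current: dict[str, str], active_users: set[str]
) -> dict[str, list[str]]:
    """Compare two inventory snapshots (agent name -> creator) and flag orphans."""
    return {
        "new": sorted(current.keys() - previous.keys()),
        "removed": sorted(previous.keys() - current.keys()),
        # Orphaned: the agent still exists but its creator is no longer active.
        "orphaned": sorted(
            name for name, creator in current.items() if creator not in active_users
        ),
    }


last_week = {"pipeline-bot": "alice", "doc-helper": "bob"}
today = {"pipeline-bot": "alice", "doc-helper": "bob", "ticket-triage": "carol"}
print(refresh(last_week, today, active_users={"bob", "carol"}))
```

In this example `alice` has left the organization, so `pipeline-bot` is reported as orphaned even though nothing about the agent itself changed.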

Platform Availability

Agentic guardrails are being rolled out on a platform-by-platform schedule. Understanding what is available today and what is coming is essential for planning.

Available today across all supported platforms:
- AI agent inventory (single pane of glass across Salesforce Agentforce, Amazon Bedrock, ChatGPT Enterprise, Microsoft Copilot Studio, Azure AI Foundry, n8n)
- Identity Graph and Effective Authority mapping
- Maker Mode detection and flagging
- Toxic Combinations identification
- Shadow and orphaned agent detection
- Blast radius analysis
- Audit logs

Runtime guardrails for Microsoft Copilot: Late Q1 2026.

Runtime guardrails for all other platforms (Salesforce Agentforce, Amazon Bedrock, ChatGPT Enterprise, Azure AI Foundry, n8n): Q2 2026.

The inventory, identity graph, and toxic combination capabilities described throughout this article are available today. Runtime enforcement controls that actively block agent actions are being deployed beginning with Microsoft Copilot and expanding to additional platforms through Q2 2026. Contact your Obsidian account team for the current rollout timeline.

See how Obsidian's runtime enforcement framework applies deterministic guardrails to your agent environment.

Frequently Asked Questions

What is the difference between agentic guardrails and model safety?

Model safety techniques like RLHF and content filters operate on model outputs. They shape what a model is likely to say. Agentic guardrails operate on agent actions inside your environment. They enforce what an agent is permitted to execute against your SaaS applications and data, regardless of what the model decides to do.

Why do posture-based controls fail for AI agents?

Posture controls check configuration at a point in time. They show theoretical configuration: what an agent is set up to do on paper. AI agents are probabilistic systems that do not follow fixed execution paths. Runtime behavior regularly diverges from configuration, particularly in maker mode scenarios where invoker identity and agent credentials are mismatched. Posture controls cannot detect that divergence.

What is maker mode and why does it create a guardrails gap?

Maker mode is a configuration where an agent uses the creator's credentials for every invocation. Any user who invokes the agent operates at the creator's privilege level, bypassing the invoker's own access controls. A user without Salesforce access can invoke a maker mode agent built by a Salesforce admin and access admin-level CRM data. Identity guardrails close this gap by correlating the runner's identity against the maker's permissions at runtime.

Do agentic guardrails require a connector to every SaaS tool?

No. Runtime enforcement connects at the AI agent platform layer via native APIs and webhooks. Combined with an identity graph mapping effective authority across the SaaS environment, this approach produces runtime visibility without requiring a dedicated connector to every application an agent touches. Security teams can deploy and act independently of IT provisioning timelines.

What are toxic combinations in the context of agentic guardrails?

Toxic combinations occur when multiple risk factors appear simultaneously on a single agent. An agent that is publicly accessible, running in maker mode with admin credentials, and connected to an unsanctioned application carries compounding risk that no single-factor scan surfaces. Guardrails frameworks should prioritize enforcement on agents carrying multiple simultaneous risk factors, as these represent the highest blast radius scenarios.

When will agentic guardrails be available for platforms beyond Copilot?

Runtime guardrails for Microsoft Copilot are available in late Q1 2026. Runtime guardrails for Salesforce Agentforce, Amazon Bedrock, ChatGPT Enterprise, Azure AI Foundry, and n8n are scheduled for Q2 2026. Inventory, effective authority mapping, and toxic combination detection are available today across all supported platforms.

Where should security teams start when implementing agentic guardrails?

Start with inventory. No guardrails framework is enforceable without knowing which agents exist, who built them, and what they are connected to. After inventory, map effective authority to identify the gap between theoretical configuration and actual access. Then identify toxic combinations and apply access guardrails first, followed by action, data, and identity guardrails in sequence.

- [AI Agent Privilege Escalation: How Agents Inherit Dangerous Permissions](/blog/ai-agent-privilege-escalation)
- [Maker Mode Security: Why Fixed-Credential Agent Connections Are a Critical Risk](/blog/maker-mode-security)
- [Agent-to-Agent Communication Security: The Multi-Agent Blind Spot](/ai-agent-runtime-security)
- [Orphaned and Unsanctioned AI Agents: The Silent Security Risk](/blog/orphaned-ai-agents)