Agent-to-Agent Communication Security: The Multi-Agent Blind Spot

Security teams have spent two years building controls around individual AI agents.

Obsidian Editorial Team

Security Research

Obsidian Security

May 16, 2026

June 1, 2026

Key Takeaways

Agent-to-agent communication is the new lateral movement layer in enterprise environments, and no existing tool sees it completely.
When a supervisor agent hands off a task to a sub-agent running on a different platform, the invoker's identity drops, credentials forward silently, and the audit trail splits across three separate log systems.
Each platform records its own slice.
This is the visibility gap that defines agent-to-agent communication security.
Enterprises running Microsoft Copilot Studio alongside Amazon Bedrock, n8n, and Salesforce Agentforce are operating multi-agent architectures whether they planned to or not.

What Agent-to-Agent Communication Is

Most security teams still think about AI agents as discrete, isolated processes. That model is already obsolete.

Modern enterprise AI deployments use multi-agent architectures, where one agent orchestrates others to complete complex, multi-step tasks. A supervisor agent receives a user request, breaks it into subtasks, and delegates each subtask to a specialized sub-agent. The sub-agents may run on entirely different platforms, use different identity contexts, and connect to different SaaS applications. The user who initiated the request has no visibility into what happens after the first handoff.

Supervisor and Sub-Agent Patterns

The supervisor/sub-agent model is the dominant architecture for enterprise agentic workflows today. A supervisor agent, often built in Microsoft Copilot Studio or Amazon Bedrock, holds the orchestration logic. It decides which sub-agents to invoke, what context to pass, and how to assemble the final response. Sub-agents handle execution: querying a CRM, writing to a database, calling an external API, or invoking another agent.

The security problem is structural. The supervisor agent was provisioned with certain permissions. Each sub-agent was provisioned separately, often by a different team, with its own set of credentials and OAuth grants. When the supervisor calls a sub-agent, it passes context. That context frequently includes data the sub-agent was never independently authorized to receive.

A2A Protocols

The industry is formalizing agent-to-agent communication through emerging protocols. The Agent2Agent (A2A) protocol, introduced by Google, defines standard mechanisms for agents to discover each other, exchange tasks, and return results. Protocols like this enable interoperability across platforms. They also create standardized pathways for context, credentials, and instructions to flow between agents without any human in the loop.

Standardization accelerates adoption. It also standardizes the attack surface. When agents communicate through a defined protocol, the protocol itself becomes a target. A sub-agent that trusts any message arriving via an A2A-compliant channel will execute instructions from any sender that can reach it through that channel.

Cross-Platform Bridges

In multi-agent architectures, one agent's tool connections can bridge to another agent's tool connections, creating chains of tool access that extend far beyond what any single agent's configuration describes. The tools available to an agent are only fully visible at runtime, not from configuration alone. When agents bridge to each other across platforms, the blast radius of any single misconfigured agent expands to include every downstream connection.

Understanding the full AI agent attack surface requires mapping these chains, not just cataloging individual agents.
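Mapping these chains amounts to a graph traversal. The sketch below is a minimal illustration, assuming a hypothetical inventory in which each agent has a set of configured tools and a list of agents it can invoke. The agent names, tool names, and the `reachable_tools` helper are invented for this example and do not come from any platform's API.

```python
from collections import deque

# Hypothetical inventory. In practice this would be assembled from
# each platform's agent configuration and observed runtime calls.
AGENT_TOOLS = {
    "supervisor": {"email.read"},
    "crm_lookup": {"crm.read"},
    "db_writer": {"db.write"},
}
AGENT_CALLS = {
    "supervisor": ["crm_lookup"],
    "crm_lookup": ["db_writer"],
    "db_writer": [],
}


def reachable_tools(start):
    """Return every tool reachable from `start` by following agent-to-agent calls."""
    seen = {start}
    queue = deque([start])
    tools = set()
    while queue:
        agent = queue.popleft()
        tools |= AGENT_TOOLS.get(agent, set())
        for callee in AGENT_CALLS.get(agent, []):
            if callee not in seen:  # guard against cycles in the call graph
                seen.add(callee)
                queue.append(callee)
    return tools


configured = AGENT_TOOLS["supervisor"]
effective = reachable_tools("supervisor")
print(sorted(effective - configured))  # tools the supervisor's own config never mentions
```

The difference between `configured` and `effective` is the part of the attack surface that a per-agent catalog misses.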

The Five Security Failures in Multi-Agent Architectures

Security teams ask why their existing controls do not catch multi-agent risks. The answer is that multi-agent architectures produce five structural failures that no single-platform tool was designed to detect.

The Five Failures

Context bleed occurs when a supervisor agent forwards its full conversation context to a sub-agent. Amazon Bedrock's supervisor agent pattern, for example, can forward the entire conversation history, including sensitive data from earlier turns, to every sub-agent it invokes. The sub-agent receives information it was never independently authorized to access. Nothing in the platform flags this as a violation because, from the platform's perspective, a legitimate agent made a legitimate call.

Credential forwarding occurs when tokens and OAuth grants travel through agent chains. An agent provisioned with admin-level Salesforce access passes its token downstream to a sub-agent handling a narrow data lookup task. The sub-agent now operates with credentials scoped far beyond its workflow requirements. This is the machine insider risk equivalent of a contractor receiving a master key because the person who hired them had one.

Trust transitivity is the logical consequence of how agents establish trust. If Agent A trusts Agent B, and Agent B trusts Agent C, Agent A implicitly trusts Agent C through the chain. No explicit trust decision was made about Agent C. No security team reviewed that relationship. The trust was inherited through the architecture. This mirrors the confused deputy problem, now operating at the identity layer across SaaS platforms.

Identity loss happens at handoffs. When a supervisor agent invokes a sub-agent, the original invoker's identity, the human who started the workflow, frequently does not propagate. The sub-agent sees the supervisor agent's identity as the caller. Downstream systems cannot determine whether the original request came from an authorized user or from an automated process operating outside any human's awareness.

Audit fragmentation is the operational consequence of all four failures above. Each platform logs its own events. Microsoft logs what Copilot Studio did. Salesforce logs what Agentforce did. Bedrock logs what its agents did. No platform logs the chain. Incident response teams trying to reconstruct what happened must manually correlate logs across systems that use different identity formats, different timestamps, and different event schemas.

Failure Summary Table

| Failure | Mechanism | Detection Difficulty |
| --- | --- | --- |
| Context bleed | Supervisor forwards full conversation history to sub-agent | High: no platform flags authorized-but-excessive data sharing |
| Credential forwarding | OAuth tokens and service account credentials passed through chains | High: token usage looks legitimate at each hop |
| Trust transitivity | Implicit trust inherited through agent-to-agent relationships | Very High: no explicit trust decision is logged |
| Identity loss | Original invoker identity drops at agent handoffs | Very High: sub-agents see calling agent, not original human |
| Audit fragmentation | Each platform logs its own slice; no cross-chain correlation | Extreme: requires manual correlation across incompatible log formats |

These failures do not require an attacker. They occur in correctly functioning multi-agent systems by design.

How Trust Propagates Across Agent Chains

The trust chain problem in multi-agent communication security is more than an analogy to network lateral movement. It is more dangerous, because it operates at the identity layer with legitimate credentials.

The Trust Chain Problem

Every agent in a chain inherits the trust decisions made by the agents above it. A sub-agent does not independently verify whether the supervisor agent's request is policy-aligned. It executes the instruction because the instruction arrived from a trusted caller. This is the same assumption that makes bearer tokens dangerous: possession implies authorization.

When an enterprise deploys a supervisor agent with broad SaaS access to orchestrate ten specialized sub-agents, each sub-agent operates under the implicit assumption that any instruction from the supervisor is legitimate. A compromised upstream agent can abuse that assumption, and a misconfigured one can trigger the same outcome by accident. The sub-agent has no mechanism to distinguish a legitimate orchestration request from a manipulated one.

One Compromised Agent Compromises All Downstream

The blast radius calculation for multi-agent architectures is multiplicative, not additive. A single compromised agent at the supervisor level exposes every sub-agent's capabilities and every data source those sub-agents can reach. A single misconfigured sub-agent with excessive permissions extends that blast radius to data the supervisor was never supposed to access.

The supply chain breach pattern demonstrates this dynamic at scale: a single compromised integration point can propagate access across hundreds of organizations. Multi-agent architectures create the same propagation dynamic inside a single enterprise, across platforms, at machine speed.

Cross-Platform Amplification

The risk amplifies when agents span platforms. A Copilot Studio supervisor invoking a Bedrock sub-agent crosses two identity systems, two permission models, and two audit logs. The effective authority of the combined chain is not the intersection of each agent's permissions. It is the union. Whatever the most permissive agent in the chain can access becomes reachable through the chain.
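A small worked example makes the union-versus-intersection point concrete. The permission names and the `chain_authority` helper below are illustrative assumptions, not real scopes from Copilot Studio or Bedrock.

```python
def chain_authority(permission_sets):
    """Return (union, intersection) of the permission sets along an agent chain."""
    sets = [set(p) for p in permission_sets]
    union = set().union(*sets)
    intersection = set.intersection(*sets) if sets else set()
    return union, intersection


# Hypothetical permissions for a two-agent, two-platform chain.
copilot_supervisor = {"sharepoint.read", "teams.read"}
bedrock_sub_agent = {"s3.read", "s3.write", "teams.read"}

reachable, shared = chain_authority([copilot_supervisor, bedrock_sub_agent])
print(sorted(shared))     # what a reviewer might assume the chain can do
print(sorted(reachable))  # what the chain can actually reach
```

In this sketch the two agents share one permission, yet a request entering at the supervisor can reach four. A review that looks at each agent separately sees neither number.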

This is why agentic AI security requires cross-platform visibility. Per-platform tools see their own agent's behavior. They do not see what that agent enables downstream.

Detection Challenges

Security teams know something is wrong when they try to answer a basic question: what did this agent actually do, and who authorized it?

Per-Platform Logs vs. Cross-Platform Chains

Every major AI platform generates logs. Microsoft Copilot Studio logs agent invocations. Amazon Bedrock logs model calls. Salesforce Agentforce logs record accesses. Each log is accurate within its own boundary. None of them captures the chain.

Reconstructing a multi-agent workflow from per-platform logs requires correlating events across systems that do not share a common identifier for the chain. The supervisor agent's session ID in Copilot Studio does not appear in Bedrock's logs. The sub-agent's action in Salesforce does not reference the original user who triggered the workflow. Security teams end up ghost chasing: assembling a picture of what might have happened from fragments that do not connect.

Identity Context Loss at Handoffs

The identity problem compounds the log problem. When an agent handoff occurs, the original invoker's identity frequently does not propagate. A security analyst reviewing Salesforce logs sees the agent's service account as the actor. They cannot determine whether a human triggered the workflow, which human triggered it, or whether that human had any authorization to access the data the agent retrieved.

This is the visibility gap that makes agent-to-agent communication security fundamentally different from traditional access control problems. Traditional IAM assumes a human identity is always traceable. Multi-agent architectures break that assumption at every handoff.

No Single Tool Sees the Full Chain

The tools enterprises already own, including native platform dashboards, SIEM integrations, and identity governance platforms, were built for human-centric access models. They log individual events. They do not correlate cross-platform agent chains into a single authority map.

The AI agent governance problem is not a data problem. Logs exist. The problem is correlation: connecting the supervisor's invocation, the sub-agent's execution, the credential that was used, the data that was accessed, and the original human identity that started the chain into a single picture of what actually happened.

A Framework for Securing Agent-to-Agent Communication

Solving agent-to-agent communication security requires four operational controls. These controls do not require replacing existing platforms. They require adding a layer that operates across platforms, at runtime, with deterministic rules.

Identity Propagation Requirements

Every agent handoff must carry the original invoker's identity forward. This is not a default behavior in any current multi-agent platform. It requires explicit design. Security teams should require that any agent architecture passing tasks between agents includes a signed identity claim from the original human invoker. Downstream agents must validate that claim before executing instructions.

Without identity propagation, every sub-agent operates as an anonymous executor. With it, every action in the chain is attributable to a specific human identity that can be checked against authorization policies.
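One way to picture a signed identity claim is a short-lived token that the supervisor attaches to every handoff and the sub-agent verifies before acting. The sketch below uses an HMAC over a JSON payload from the Python standard library. It is a simplified illustration under stated assumptions: the shared key, the field names (`sub`, `wf`, `exp`), and the `sign_claim` and `verify_claim` helpers are invented here, and a production design would more likely rely on asymmetric signatures and an established token format.

```python
import base64
import hashlib
import hmac
import json
import time

# Assumption: a key shared out of band between supervisor and sub-agent.
SHARED_KEY = b"demo-key-not-for-production"


def sign_claim(user_id, workflow_id, ttl_seconds=300, now=None):
    """Issue a signed claim naming the original human invoker."""
    issued = int(now if now is not None else time.time())
    payload = json.dumps(
        {"sub": user_id, "wf": workflow_id, "exp": issued + ttl_seconds},
        sort_keys=True,
    ).encode()
    tag = hmac.new(SHARED_KEY, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(tag).decode()
    )


def verify_claim(token, now=None):
    """Return the claim if signature and expiry check out, else raise ValueError."""
    try:
        payload_b64, tag_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except Exception as exc:
        raise ValueError("malformed claim") from exc
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("bad signature")
    claim = json.loads(payload)
    current = now if now is not None else time.time()
    if claim["exp"] < current:
        raise ValueError("claim expired")
    return claim


token = sign_claim("alice@example.com", "wf-123")
claim = verify_claim(token)  # the sub-agent now knows which human started the chain
```

The sub-agent can then check `claim["sub"]` against its own authorization policy instead of trusting the calling agent's identity alone.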

Context Minimization

Supervisor agents must pass only the context a sub-agent needs to complete its specific task. Full conversation forwarding is the default in many orchestration frameworks. It is the wrong default. Context minimization limits context bleed and reduces the data exposed if a sub-agent is compromised or misconfigured.

This mirrors the principle of least privilege applied to data flow rather than access grants. The least-privilege framework for AI agents applies at the permission layer and at the context layer simultaneously.
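Context minimization can be as simple as an allowlist applied at the handoff. The sketch below assumes a hypothetical per-sub-agent list of permitted context fields. The agent names, field names, and the `minimize_context` helper are illustrative only.

```python
# Hypothetical allowlists: the context fields each sub-agent may receive.
CONTEXT_ALLOWLIST = {
    "crm_lookup": {"account_id", "requested_fields"},
    "ticket_writer": {"ticket_summary", "priority"},
}


def minimize_context(sub_agent, full_context):
    """Return only the fields the named sub-agent is allowed to receive."""
    allowed = CONTEXT_ALLOWLIST.get(sub_agent, set())  # unknown agents get nothing
    return {key: value for key, value in full_context.items() if key in allowed}


full_context = {
    "account_id": "A-1001",
    "requested_fields": ["owner", "renewal_date"],
    "conversation_history": ["...earlier turns with sensitive data..."],
    "ticket_summary": "Renewal question",
}

handoff = minimize_context("crm_lookup", full_context)
# handoff contains account_id and requested_fields; the conversation history stays behind
```

Defaulting to an empty allowlist for unknown sub-agents keeps the failure mode closed: a newly added agent receives nothing until someone decides what it needs.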

Audit Correlation Across Platforms

A complete audit trail for a multi-agent workflow requires correlating events across every platform the chain touches. This cannot be done manually at scale. It requires a layer that ingests events from each platform and reconstructs the chain using a common identifier, the original session or workflow ID, that persists across handoffs.

Obsidian's Identity Graph provides this correlation layer. By mapping relationships between agents, identities, applications, and actions across platforms, it produces a single authority map showing what each agent in a chain actually did, what data it accessed, and whether the original invoker had authorization for those actions. This is runtime truth, not theoretical configuration.
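Independent of any particular product, the core of correlation by a common identifier can be sketched in a few lines. The example below assumes two hypothetical log sources with different field names and timestamp formats, both of which happen to carry the same workflow ID. The platform labels, record fields, and helper functions are invented for illustration and do not reflect real log schemas or how any vendor implements this.

```python
from collections import defaultdict
from datetime import datetime, timezone


def normalize(platform, raw):
    """Map a platform-specific record onto one shared shape (hypothetical schemas)."""
    if platform == "platform_a":  # ISO-8601 timestamps, "correlationId" key
        ts = datetime.fromisoformat(raw["time"])
        workflow_id = raw["correlationId"]
    elif platform == "platform_b":  # epoch seconds, "trace" key
        ts = datetime.fromtimestamp(raw["epoch"], tz=timezone.utc)
        workflow_id = raw["trace"]
    else:
        raise ValueError(f"unknown platform: {platform}")
    return {
        "workflow_id": workflow_id,
        "ts": ts,
        "platform": platform,
        "actor": raw["actor"],
        "action": raw["action"],
    }


def reconstruct_chains(events):
    """Group (platform, record) pairs by workflow ID and order each chain by time."""
    chains = defaultdict(list)
    for platform, raw in events:
        event = normalize(platform, raw)
        chains[event["workflow_id"]].append(event)
    for chain in chains.values():
        chain.sort(key=lambda e: e["ts"])
    return dict(chains)


events = [
    ("platform_b", {"epoch": 1778925605, "trace": "wf-123",
                    "actor": "sub-agent", "action": "query_records"}),
    ("platform_a", {"time": "2026-05-16T10:00:00+00:00", "correlationId": "wf-123",
                    "actor": "supervisor", "action": "invoke_sub_agent"}),
]
for event in reconstruct_chains(events)["wf-123"]:
    print(event["ts"].isoformat(), event["platform"], event["actor"], event["action"])
```

The hard part in practice is the assumption this sketch takes for granted: that every platform in the chain records the same workflow identifier in the first place.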

Deterministic Boundaries Between Agents

Probabilistic agents require deterministic guardrails. Each agent in a chain should operate within fixed, enforceable boundaries: specific data sources it can query, specific actions it can take, specific downstream agents it can invoke. These boundaries must be enforced at runtime, not defined in configuration and trusted to hold.

Deterministic guardrails break the trust transitivity problem. If Agent B cannot invoke Agent C regardless of what Agent A instructs, the chain terminates at the boundary. The blast radius of any single compromised or misconfigured agent becomes bounded rather than unlimited.
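A deterministic boundary is a fixed allowlist checked before every call, outside the agent's own reasoning. The sketch below is a minimal illustration, assuming a hypothetical policy table keyed by agent name. The `Boundary` type, the agent names, and the check functions are invented for this example.

```python
from dataclasses import dataclass


class BoundaryViolation(Exception):
    """Raised when an agent attempts something outside its fixed boundary."""


@dataclass(frozen=True)
class Boundary:
    data_sources: frozenset = frozenset()
    actions: frozenset = frozenset()
    may_invoke: frozenset = frozenset()


# Hypothetical policy: Agent A may call Agent B, and nobody may call Agent C.
BOUNDARIES = {
    "agent_a": Boundary(actions=frozenset({"summarize"}),
                        may_invoke=frozenset({"agent_b"})),
    "agent_b": Boundary(data_sources=frozenset({"crm"}),
                        actions=frozenset({"read"})),
    "agent_c": Boundary(data_sources=frozenset({"hr_db"}),
                        actions=frozenset({"read", "write"})),
}


def check_invoke(caller, callee):
    """Allow the call only if the caller's boundary names the callee."""
    boundary = BOUNDARIES.get(caller)
    if boundary is None or callee not in boundary.may_invoke:
        raise BoundaryViolation(f"{caller} may not invoke {callee}")


def check_access(agent, data_source, action):
    """Allow the access only if both the source and the action are in bounds."""
    boundary = BOUNDARIES.get(agent)
    if (boundary is None
            or data_source not in boundary.data_sources
            or action not in boundary.actions):
        raise BoundaryViolation(f"{agent} may not {action} {data_source}")


check_invoke("agent_a", "agent_b")  # permitted by policy
try:
    check_invoke("agent_b", "agent_c")  # blocked no matter what agent_a instructed
except BoundaryViolation as err:
    print(err)
```

Because the check runs in the enforcement layer rather than in the agent's prompt, no instruction arriving from upstream can widen what Agent B is allowed to do.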

Security teams building multi-agent governance programs should start with a risk assessment that maps every existing agent chain, identifies cross-platform handoffs, and flags toxic combinations where excessive permissions meet public accessibility or orphaned ownership.

See how Obsidian maps cross-platform agent chains and enforces deterministic boundaries at runtime.

Frequently Asked Questions

What is agent-to-agent communication security?

Agent-to-agent communication security refers to the controls, visibility mechanisms, and enforcement policies that govern how AI agents pass context, credentials, and instructions to other AI agents. It addresses risks including identity loss at handoffs, credential forwarding through chains, and trust transitivity across platforms.

Why is multi-agent communication harder to secure than single-agent deployments?

Single-agent deployments operate within one platform's permission model and log system. Multi-agent architectures span platforms, identity systems, and audit logs. No single platform tool sees the full chain. Identity context drops at handoffs, and the blast radius of any misconfigured agent extends to every downstream agent it can invoke.

What is trust transitivity in multi-agent systems?

Trust transitivity is the implicit trust relationship that forms when Agent A trusts Agent B and Agent B trusts Agent C. Agent A effectively trusts Agent C without any explicit security review of that relationship. This mirrors the confused deputy problem and creates privilege escalation pathways that no IAM policy explicitly authorized.

How does context bleed expose sensitive data in agent chains?

Context bleed occurs when a supervisor agent forwards its full conversation history to a sub-agent. The sub-agent receives data from earlier conversation turns that it was never independently authorized to access. This is a default behavior in several orchestration frameworks and requires explicit context minimization controls to prevent.

What does identity propagation mean in agent-to-agent communication?

Identity propagation means carrying the original human invoker's identity forward through every agent handoff in a chain. Without it, downstream agents see only the calling agent's identity, making it impossible to verify whether the original human had authorization for the actions the chain ultimately took.

How can security teams detect cross-platform agent chain activity?

Per-platform logs capture individual agent events but do not correlate cross-platform chains. Detection requires a cross-platform layer that ingests events from each AI platform, maps them to a common workflow or session identifier, and reconstructs the full chain with identity context preserved at each step. This is the operational gap that runtime AI security platforms are designed to close.

- [What Are Agentic Guardrails? Deterministic Controls for Probabilistic Systems](/agentic-guardrails)
- [AI Agent Privilege Escalation: How Agents Inherit Dangerous Permissions](/blog/ai-agent-privilege-escalation)
- [Orphaned and Unsanctioned AI Agents: The Silent Security Risk](/blog/orphaned-ai-agents)
- [Maker Mode Security: Why Fixed-Credential Agent Connections Are a Critical Risk](/blog/maker-mode-security)