Blog

Double Agent: Service Agent Privilege Escalation in Google Vertex AI

Posted by: Eli Shparaga, Erez Hasson
January 15, 2026
Getting your Trinity Audio player ready...

TL;DR

While analyzing Google’s Vertex AI, we discovered two distinct attack vectors, specifically in Ray on Vertex AI and the Vertex AI Agent Engine, where default configurations allow low-privileged users to pivot into higher-privileged Service Agent roles.

As organizations rush to integrate Generative AI, with 98% of enterprises currently experimenting or deploying infrastructure like Google Cloud Vertex, AI has become a critical building block. However, the speed of this transformation has introduced overlooked identity risks.

Central to this is the role of Service Agents: special service accounts created and managed by Google Cloud that allow services to access your resources and perform internal processes on your behalf. Because these “invisible” managed identities are required for services to function, they are often automatically granted broad project-wide permissions.

We recently analyzed Google’s Vertex AI and found two attack vectors within the Vertex AI Agent Engine and Ray on Vertex AI. These vulnerabilities allow an attacker with minimal permissions to hijack high-privileged Service Agents, effectively turning these “invisible” managed identities into “double agents” that facilitate privilege escalation.

When we disclosed the findings to Google, their rationale was that the services are currently “working as intended.” Because this configuration remains the default today, platform engineers and security teams must understand the technical mechanics of these attacks to immediately secure their environments.

In this blog, we will analyze each of these attack vectors, explain their potential impact, and provide recommendations for engineers and security teams for securing their instances.

The “Double Agent” Problem: A Confused Deputy Attack

Both attack paths identified rely on a common mechanism: a low-privileged user (e.g., “Viewer”) interacts with a compute instance managed by Vertex AI, achieves code execution, and extracts the credentials of an attached Service Agent. While the initial user has limited rights, the hijacked Service Agent often possesses broad project-wide permissions.

Feature Vertex AI Agent Engine Ray on Vertex AI
Primary Target Reasoning Engine Service Agent Custom Code Service Agent 
Vulnerability Type Malicious Tool Call (RCE) Insecure Default Access (Viewer to Root)
Initial Permission aiplatform.reasoningEngines.update aiplatform.persistentResources.get/list
Impact Access to LLM memories, chats, and GCS Root access to the Ray cluster; Read/Write to BigQuery & GCS

Vulnerability #1: Vertex AI Agent Engine Tool Injection

The Vertex AI Agent Engine allows developers to deploy AI agents on GCP infrastructure. This supports a few frameworks for developing the agents, as described in GCP’s documentation.

Some of these frameworks, like Google’s ADK, allow developers to upload code onto abstracted compute instances called reasoning engines.

Following the example in the ADK deployment tutorial, we have 2 main Python files; one is responsible for the agent’s logic, such as tool calls, and the other is responsible for the deployment process.

The deployment process involves pickling (serializing) Python code and uploading it to a staging Google Cloud Storage (GCS) bucket.


Generally, the process for deploying an agent via ADK is as follows:

Because the tool calls defined in the agent code can be any Python code, we tried placing a reverse shell inside one of the tool calls and updating the engine, which worked and gave us access to the compute instance backing the reasoning engine. 

After we made a query to trigger a tool call, our reverse shell ran and gave us access to the Reasoning Engine’s compute instance. We then identified that the attached identity was the default Reasoning Engine Service Agent used by the ADK.

With the proof of concept established, our next step was to leverage this capability to escalate privileges.

When trying to whittle down the required permissions to perform the update, we discovered that a public bucket from any account could be used as the staging bucket.

Removing the need for storage permissions resulted in these minimum permissions for updating a reasoning engine via the ADK.

Since there are privileges gained that are not part of the privileges required to perform this action, this is a privilege escalation via a confused deputy attack.

The Escalation Path

This attack vector targets the aiplatform.reasoningEngines.update permission.

  1. Malicious update: An attacker with access to the project updates an existing reasoning engine with a tool containing malicious code. In our Proof of Concept, we embedded a Python reverse shell inside a standard currency conversion tool.
  2. Remote Code Execution (RCE): When the tool is called, either by the attacker or a legitimate user, the shell executes on the reasoning engine’s instance.
  3. Confused Deputy: From the compromised instance, the attacker accesses the instance metadata service to request the token for the “Reasoning Engine Service Agent” (default: service-<project_id>@gcp-sa-aiplatform-re.iam.gserviceaccount.com).

Technical Impact

By default, this service agent is attached to the instance, allowing for privilege escalation via a confused deputy attack. The privileges gained include access to:

  • Vertex AI: aiplatform.memories.*,aiplatform.sessionEvents.*,aiplatform.sessions.get,aiplatform.sessions.list,aiplatform.sessions.update.
  • Storage: storage.buckets.get, storage.buckets.list, storage.objects.get, storage.objects.list.
  • Resource Management: resourcemanager.projects.get.
  • Monitoring & Logging: logging.logEntries.create, monitoring.timeSeries.create.

Practically, this allows an attacker to read all chat sessions, read LLM memories, and read potentially sensitive information stored in storage buckets.

Vulnerability #2: Ray on Vertex AI – Viewer to Root

The Ray on Vertex AI feature allows ML engineers to leverage GCP’s infrastructure together with the Ray library, which aids in developing scalable AI workloads.

While analyzing this feature, we found that when a Ray cluster is deployed, the “Custom Code Service Agent” is automatically attached to the cluster’s head node.

To our surprise, we discovered that an attacker with control over an identity that has the aiplatform.persistentResources.list and aiplatform.persistentResources.get in an environment in which a “Ray on Vertex AI” cluster exists can connect to the head node via the GCP UI and get root access to the node.

The attacker can then use the metadata service to retrieve the access token for the service agent and use its privileges in the project.

These privileges are also included in the default “Vertex AI Viewer” role.

The Escalation Path

An attacker requires only an identity with aiplatform.persistentResources.list and aiplatform.persistentResources.get permissions. These are standard permissions found in the read-only Vertex AI Viewer role.

  1. Gaining root: The attacker navigates to the cluster in the GCP Console. Despite having only “Viewer” permissions, the interface exposes a “Head node interactive shell” link. Clicking this grants an interactive shell on the head node with root privileges.
  2. Token extraction: With root access, the attacker queries the metadata service to retrieve the access token for the Custom Code Service Agent.

Technical Impact

The Custom Code Service Agent role has extensive permissions, including iam.serviceAccounts.signBlob and iam.serviceAccounts.getAccessToken. However, our testing confirmed that the extracted token has a limited scope, meaning IAM operations are blocked.

Critically, the token does possess the following scopes:

This allows a “Viewer” to read and write to various services, including storage buckets, logs, and BigQuery resources.

Securing Your Agents

While Google has forwarded this report to their product team to determine if a fix is required, the current “working as intended” status means there is no guarantee of an immediate patch. Until a change is made, it is up to engineering and security teams to ensure such “double agents” are not created within their projects.

To mitigate these risks:

  • For Ray on Vertex AI: Audit identities with the “Viewer” role. Restrict aiplatform.persistentResources permissions to only those requiring compute access.
  • For the Agent Engine: Tightly restrict the aiplatform.reasoningEngines.update permission to prevent unauthorized code injection.
  • Monitoring: Leverage the monitoring capabilities provided by the Agent Engine Threat Detection feature.

 


mxcyber

Eli Shparaga

Check Out More Resources

View More
The 3 Key Ingredients to Getting CTEM Right
Jason Fruge | November 17, 2024

As a CISO with over 25 years of experience across diverse industries, I’ve seen the limitations of traditional vulnerability management firsthand. It’s often a…

Why and How to Adopt the CTEM Framework

Attack Surfaces are expanding as organizations invest in Cloud, SaaS and third-party supplier relationships to support business needs. At the same time, security teams…

Active Directory Security Checklist

Active Directory is the key to your network, responsible for connecting users with network resources – but it’s also a prime target for attackers….

Buyer’s Guide to Meeting and Maintaining CTEM

The movement from fractured Vulnerability Management processes to integrated Exposure Management efforts has helped organizations take greater control of the issues that put them…

Research Report: 2023 State of Exposure Management

Don’t miss out on exclusive research that explores the challenges organizations face in managing security exposures and provides insights on how to overcome them….

Establishing a Modern Exposure Management Program

This session provides a comprehensive overview of the evolution of vulnerability management and explains why critical vulnerabilities do not necessarily equal risk. By watching…

Buyers Guide: Risk Exposure Reduction and Vulnerability Prioritization

2023 is almost here and security teams are focused on locking-in the funds needed to keep their orgs secured in the coming year. But…

The Necessity of Attack Path Management for the Hybrid Cloud

Published in collaboration with the UK Chapter of the Cloud Security Alliance, this whitepaper explores the necessity of attack path management for today’s hybrid…

Case Study: How XM Cyber Helps Hamburg Port Authority Secure its Vast IT Infrastructure

“Because it offers continuous,  automated protection, security issues  that would normally take dozens of  manual steps to discover are surfaced  almost instantaneously.  We have historically…

‘Total Economic Impact’ Study Concludes That XM Cyber Delivered 394% Return On Investment

Attack Path Management Significantly Reduces Risk of Fines and Remediation Expenditures, Reduces Pen Testing and Labor Costs

See what attackers see, so you can stop them from doing what attackers do.