|
Getting your Trinity Audio player ready...
|
TL;DR
While analyzing Google’s Vertex AI, we discovered two distinct attack vectors, specifically in Ray on Vertex AI and the Vertex AI Agent Engine, where default configurations allow low-privileged users to pivot into higher-privileged Service Agent roles.

As organizations rush to integrate Generative AI, with 98% of enterprises currently experimenting or deploying infrastructure like Google Cloud Vertex, AI has become a critical building block. However, the speed of this transformation has introduced overlooked identity risks.
Central to this is the role of Service Agents: special service accounts created and managed by Google Cloud that allow services to access your resources and perform internal processes on your behalf. Because these “invisible” managed identities are required for services to function, they are often automatically granted broad project-wide permissions.
We recently analyzed Google’s Vertex AI and found two attack vectors within the Vertex AI Agent Engine and Ray on Vertex AI. These vulnerabilities allow an attacker with minimal permissions to hijack high-privileged Service Agents, effectively turning these “invisible” managed identities into “double agents” that facilitate privilege escalation.
When we disclosed the findings to Google, their rationale was that the services are currently “working as intended.” Because this configuration remains the default today, platform engineers and security teams must understand the technical mechanics of these attacks to immediately secure their environments.
In this blog, we will analyze each of these attack vectors, explain their potential impact, and provide recommendations for engineers and security teams for securing their instances.
The “Double Agent” Problem: A Confused Deputy Attack
Both attack paths identified rely on a common mechanism: a low-privileged user (e.g., “Viewer”) interacts with a compute instance managed by Vertex AI, achieves code execution, and extracts the credentials of an attached Service Agent. While the initial user has limited rights, the hijacked Service Agent often possesses broad project-wide permissions.
| Feature | Vertex AI Agent Engine | Ray on Vertex AI |
| Primary Target | Reasoning Engine Service Agent | Custom Code Service Agent |
| Vulnerability Type | Malicious Tool Call (RCE) | Insecure Default Access (Viewer to Root) |
| Initial Permission | aiplatform.reasoningEngines.update | aiplatform.persistentResources.get/list |
| Impact | Access to LLM memories, chats, and GCS | Root access to the Ray cluster; Read/Write to BigQuery & GCS |
Vulnerability #1: Vertex AI Agent Engine Tool Injection
The Vertex AI Agent Engine allows developers to deploy AI agents on GCP infrastructure. This supports a few frameworks for developing the agents, as described in GCP’s documentation.
Some of these frameworks, like Google’s ADK, allow developers to upload code onto abstracted compute instances called reasoning engines.
Following the example in the ADK deployment tutorial, we have 2 main Python files; one is responsible for the agent’s logic, such as tool calls, and the other is responsible for the deployment process.
The deployment process involves pickling (serializing) Python code and uploading it to a staging Google Cloud Storage (GCS) bucket.
Generally, the process for deploying an agent via ADK is as follows:
Because the tool calls defined in the agent code can be any Python code, we tried placing a reverse shell inside one of the tool calls and updating the engine, which worked and gave us access to the compute instance backing the reasoning engine.
After we made a query to trigger a tool call, our reverse shell ran and gave us access to the Reasoning Engine’s compute instance. We then identified that the attached identity was the default Reasoning Engine Service Agent used by the ADK.
With the proof of concept established, our next step was to leverage this capability to escalate privileges.
When trying to whittle down the required permissions to perform the update, we discovered that a public bucket from any account could be used as the staging bucket.
Removing the need for storage permissions resulted in these minimum permissions for updating a reasoning engine via the ADK.
Since there are privileges gained that are not part of the privileges required to perform this action, this is a privilege escalation via a confused deputy attack.
The Escalation Path
This attack vector targets the aiplatform.reasoningEngines.update permission.
- Malicious update: An attacker with access to the project updates an existing reasoning engine with a tool containing malicious code. In our Proof of Concept, we embedded a Python reverse shell inside a standard currency conversion tool.
- Remote Code Execution (RCE): When the tool is called, either by the attacker or a legitimate user, the shell executes on the reasoning engine’s instance.
- Confused Deputy: From the compromised instance, the attacker accesses the instance metadata service to request the token for the “Reasoning Engine Service Agent” (default: service-<project_id>@gcp-sa-aiplatform-re.iam.gserviceaccount.com).
Technical Impact
By default, this service agent is attached to the instance, allowing for privilege escalation via a confused deputy attack. The privileges gained include access to:
- Vertex AI: aiplatform.memories.*,aiplatform.sessionEvents.*,aiplatform.sessions.get,aiplatform.sessions.list,aiplatform.sessions.update.
- Storage: storage.buckets.get, storage.buckets.list, storage.objects.get, storage.objects.list.
- Resource Management: resourcemanager.projects.get.
- Monitoring & Logging: logging.logEntries.create, monitoring.timeSeries.create.
Practically, this allows an attacker to read all chat sessions, read LLM memories, and read potentially sensitive information stored in storage buckets.
Vulnerability #2: Ray on Vertex AI – Viewer to Root
The Ray on Vertex AI feature allows ML engineers to leverage GCP’s infrastructure together with the Ray library, which aids in developing scalable AI workloads.
While analyzing this feature, we found that when a Ray cluster is deployed, the “Custom Code Service Agent” is automatically attached to the cluster’s head node.
To our surprise, we discovered that an attacker with control over an identity that has the aiplatform.persistentResources.list and aiplatform.persistentResources.get in an environment in which a “Ray on Vertex AI” cluster exists can connect to the head node via the GCP UI and get root access to the node.
The attacker can then use the metadata service to retrieve the access token for the service agent and use its privileges in the project.
These privileges are also included in the default “Vertex AI Viewer” role.
The Escalation Path
An attacker requires only an identity with aiplatform.persistentResources.list and aiplatform.persistentResources.get permissions. These are standard permissions found in the read-only Vertex AI Viewer role.
- Gaining root: The attacker navigates to the cluster in the GCP Console. Despite having only “Viewer” permissions, the interface exposes a “Head node interactive shell” link. Clicking this grants an interactive shell on the head node with root privileges.
- Token extraction: With root access, the attacker queries the metadata service to retrieve the access token for the Custom Code Service Agent.
Technical Impact
The Custom Code Service Agent role has extensive permissions, including iam.serviceAccounts.signBlob and iam.serviceAccounts.getAccessToken. However, our testing confirmed that the extracted token has a limited scope, meaning IAM operations are blocked.
Critically, the token does possess the following scopes:
- https://www.googleapis.com/auth/devstorage.full_control
- https://www.googleapis.com/auth/bigquery
- https://www.googleapis.com/auth/pubsub
- https://www.googleapis.com/auth/cloud-platform.read-only
This allows a “Viewer” to read and write to various services, including storage buckets, logs, and BigQuery resources.
Securing Your Agents
While Google has forwarded this report to their product team to determine if a fix is required, the current “working as intended” status means there is no guarantee of an immediate patch. Until a change is made, it is up to engineering and security teams to ensure such “double agents” are not created within their projects.
To mitigate these risks:
- For Ray on Vertex AI: Audit identities with the “Viewer” role. Restrict aiplatform.persistentResources permissions to only those requiring compute access.
- For the Agent Engine: Tightly restrict the aiplatform.reasoningEngines.update permission to prevent unauthorized code injection.
- Monitoring: Leverage the monitoring capabilities provided by the Agent Engine Threat Detection feature.









