
With tech companies racing to ship AI browser agents and other agentic AI-powered tools, security researchers have warned that these products could open the door to a new class of vulnerabilities and cyber attacks. But Google has suggested a unique way to address these risks: use another AI model to monitor what the AI agents do.
The search giant on Monday, November 8, announced that it is upgrading the Chrome browser with a new security layer comprising a separate large language model (LLM) called ‘User Alignment Critic’ that is isolated from untrusted web content and will vet the actions of the AI browser agent.
The announcement comes as Google looks to upgrade Chrome with agentic capabilities, currently in preview, following the recent integration of its Gemini AI chatbot in the popular web browser (currently accessible only in the US).
Google said it is also adding origin-isolation capabilities in Chrome to constrain what origins the AI agents can interact with. “Our layered defense also includes user confirmations for critical steps, real-time detection of threats, and red-teaming and response,” the Alphabet-owned company said.
This new security architecture in Chrome is designed to mitigate indirect prompt injection attacks, where threat actors hide malicious instructions in iframes and web pages to trick the AI browser agent into carrying out financial transactions or exfiltrating sensitive data from logged in accounts.
A few months ago, security researchers at Brave had flagged a potential vulnerability in Perplexity’s agentic AI browser, Comet, that could allow attackers to maliciously instruct the browser agent via indirect prompt injection and gain access to sensitive user data, including emails, banking passwords, and other personal information.
Antigravity, Google’s new AI agent-driven software development platform, is also prone to indirect prompt injection attacks, and this has been classified as a known issue by the company on its bug-hunting page.
– User Alignment Critic: It is an isolated Gemini model that is designed to double-check each proposed action by the ‘planner model’. If the action is misaligned, the model will veto it and provide feedback to the planner model so that it can re-formulate its plan.
– Origin Sets: This feature restricts the AI agent’s access to the web and only permits interactions with specific sites that are related to the task at hand, or data that the user has chosen to share with the agent.
-User control and oversight: In addition to Chrome’s real-time scanning that detects more traditional scams, the browser will also run a prompt injection classifier to check every page it sees for indirect prompt injection.
Google further said that it has developed automated red-teaming systems that generate test sites and LLM-driven attacks to continuously test its own defenses. The tech giant has also announced bounty payments of up to $20,000 for anyone who flags breaches in the new security layer.