
Google has formally launched a dedicated AI Vulnerability Reward Program (AI VRP), creating a structured channel for security researchers to report flaws in its AI systems for rewards of up to $30,0001, 2. Announced on October 6, 2025, this initiative builds upon a two-year effort where Google awarded over $430,000 for AI vulnerabilities within its existing Vulnerability Reward Programs1, 7. The new program distinguishes itself by clearly defining the types of AI-specific security flaws it seeks, moving beyond traditional software bugs to address the unique attack surfaces presented by large language models and AI integrations.
This program is strategically important for organizations relying on Google’s AI ecosystem. For security teams, it provides a clear framework for understanding the attack vectors that Google considers most critical in its AI products. The delineation between security vulnerabilities, which are in scope for bounties, and content safety issues, which are not, is a crucial distinction that guides both offensive and defensive security efforts. The program’s structure offers a valuable taxonomy for threat modeling AI applications, highlighting areas where malicious actors are likely to focus their efforts.
Program Scope and Vulnerability Taxonomy
The AI VRP focuses on Google’s most prominent AI products, categorized into tiers. Flagship products include Google Search, Gemini Apps across web and mobile platforms, and core Google Workspace applications like Gmail, Drive, Meet, and Calendar1, 2, 7. Other in-scope products include AI Studio, Jules, NotebookLM, and various AI integrations1, 4. Google has explicitly clarified that it is not seeking reports of AI hallucinations or merely factually incorrect outputs, as these are considered content issues rather than security vulnerabilities2, 4.
The vulnerability categories reveal Google’s prioritization of attacks with tangible security impact. The highest bounties target flaws that enable unauthorized actions or data exposure through AI systems. This classification system provides security researchers with a targeted approach for testing AI implementations, focusing on scenarios where AI model behavior can be manipulated to achieve traditional security compromise objectives. The categorization reflects an understanding that while AI systems introduce new attack vectors, the ultimate security impact often aligns with conventional security concerns like unauthorized access and data exfiltration.
Category | Description | Example | Max Bounty (Flagship) |
---|---|---|---|
S1: Rogue Actions | Attacks that modify a victim’s account or data with clear security impact | Indirect prompt injection causing Google Home to unlock a door2 | $20,0001 |
S2: Sensitive Data Exfiltration | Unauthorized extraction of sensitive data | Prompt injection that summarizes user emails and sends to attacker2 | $15,0001 |
A1: Phishing Enablement | Using AI to facilitate phishing attacks | Hijacking Gemini to display fake email summaries4 | $5,0001 |
A2: Model Theft | Theft of secret model parameters or weights | N/A | $5,0001 |
Reward Structure and Testing Methodology
The base reward for the most severe “Rogue Actions” vulnerability in flagship products is $20,0001, 7. Researchers can earn additional bonus multipliers for high-quality reports and novelty, potentially increasing total payouts to a maximum of $30,000 (approximately ₹26 lakh)1, 2, 9. Rewards are proportionally lower for the same vulnerability categories found in “Standard” or “Other” tier products like Jules or AI Studio1, 7. This tiered approach incentivizes research on the most widely used AI products where vulnerabilities would have the broadest impact.
For security testers, the provided examples offer concrete testing scenarios. The “Rogue Actions” category specifically mentions indirect prompt injection attacks that could manipulate smart home devices through Google Home or cause Google Calendar entries to trigger unauthorized actions like opening smart shutters2, 6, 7. These examples illustrate the potential for AI systems to serve as unconventional attack vectors against connected infrastructure. Testing should focus on how AI interactions with other systems and APIs might be manipulated to achieve security impacts beyond the AI interface itself.
Distinction Between Security and Content Issues
A critical aspect of the AI VRP is its explicit exclusion of content-related issues. The program does not cover AI hallucinations, generation of hate speech, biased content, or copyright-infringing material2, 4, 6. Google considers these problems with the model’s content and safety training rather than security vulnerabilities. These issues should be reported through in-product feedback channels so Google’s AI safety teams can address them through model-wide retraining2, 4.
This distinction is important for researchers to understand, as it focuses the bug bounty program specifically on vulnerabilities that enable traditional security compromises through AI systems. The separation reflects Google’s organizational structure, with different teams handling model safety versus security vulnerabilities. For organizations implementing similar programs, this approach provides a model for distinguishing between AI reliability concerns and genuine security vulnerabilities that could be exploited by attackers.
Complementary Security Initiative: CodeMender
Alongside the bug bounty program, Google’s DeepMind announced CodeMender, an AI agent designed to automatically find and patch vulnerabilities in open-source code2, 7. According to DeepMind, the agent proactively rewrites existing code to use more secure data structures and APIs7. Google reported that CodeMender has already been used to patch 72 security fixes in open-source projects, including some with up to 4.5 million lines of code, after human review over the past six months2, 7.
This complementary initiative represents Google’s investment in both offensive and defensive AI security applications. While the bug bounty program incentivizes external researchers to find vulnerabilities, CodeMender represents an automated approach to vulnerability remediation. The implementation of such tools could significantly impact software supply chain security, particularly for widely used open-source dependencies that form the foundation of many enterprise applications.
Broader Context and Strategic Implications
The AI VRP is part of Google’s substantial investment in security through vulnerability rewards. In 2024 alone, Google awarded nearly $12 million in total bug bounties to 632 researchers1. Since launching its first VRP in 2010, Google has paid out a cumulative $65 million in rewards1. This established infrastructure provides a solid foundation for the new AI-focused program, with well-defined submission processes and reward mechanisms.
For security professionals, Google’s vulnerability taxonomy provides a framework for assessing AI system security beyond Google’s products. The categories of “Rogue Actions” and “Sensitive Data Exfiltration” represent concrete threats that should be considered in any AI implementation. Organizations developing or deploying AI systems can use this classification to inform their own security testing and threat modeling efforts, focusing on scenarios where AI model behavior could lead to traditional security compromises.
The launch of Google’s AI VRP represents a significant step in maturing AI security practices. By providing clear guidelines and substantial incentives, Google is encouraging systematic security research into AI systems while distinguishing between security vulnerabilities and content safety issues. The program’s structure offers valuable insights into the most critical attack vectors for AI implementations, providing a framework that security teams can apply to their own AI security assessments. As AI integration continues to expand across enterprise systems, this focused approach to identifying and mitigating AI-specific security risks will become increasingly important for maintaining overall security posture.
References
- BleepingComputer, “Google’s new AI bug bounty program pays up to $30,000 for flaws,” Oct. 7, 2025.
- The Verge, “Google launches AI bug bounty program to find vulnerabilities in Gemini and search,” Oct. 6, 2025.
- Times of India, “Google starts AI bug bounty programme with rewards up to $30,000,” Oct. 7, 2025.
- TechRadar, “Google will pay you up to $30,000 if you can find security flaws in its AI,” Oct. 7, 2025.
- India Today, “Google’s new AI bug bounty programme: How to earn up to Rs 26 lakh by reporting vulnerabilities,” Oct. 7, 2025.
- Moneycontrol, “Google launches AI bug bounty programme with rewards of up to $30,000: How to participate,” Oct. 7, 2025.
- Heise Online, “Google belohnt das Finden von KI-Sicherheitslücken mit bis zu 30.000 Dollar,” Oct. 7, 2025.
- Gizbot, “Google’s AI Bug Bounty Program Offers Up To $30,000 For Finding Flaws: Report,” Oct. 7, 2025.