In September, we shared how we are implementing the voluntary AI commitments that we and others in industry made at the White House in July. One of the most important developments involves expanding our existing Bug Hunter Program to foster third-party discovery and reporting of issues and vulnerabilities specific to our AI systems. Today, we’re publishing more details on these new reward program elements for the first time. Last year we issued over $12 million in rewards to security researchers who tested our products for vulnerabilities, and we expect today’s announcement to fuel even greater collaboration for years to come. 



What’s in scope for rewards 

In our recent AI Red Team report, we identified common tactics, techniques, and procedures (TTPs) that we consider most relevant and realistic for real-world adversaries to use against AI systems. The following table incorporates shared learnings from Google’s AI Red Team exercises to help the research community better understand what’s in scope for our reward program. We’re detailing our criteria for AI bug reports to assist our bug hunting community in effectively testing the safety and security of AI products. Our scope aims to facilitate testing for traditional security vulnerabilities as well as risks specific to AI systems. It is important to note that reward amounts are dependent on severity of the attack scenario and the type of target affected (go here for more information on our reward table). 




Category

Attack Scenario

Guidance

Prompt Attacks: Crafting adversarial prompts that allow an adversary to influence the behavior of the model, and hence the output in ways that were not intended by the application.

Prompt injections that are invisible to victims and change the state of the victim’s account or or any of their assets.

In Scope

Prompt injections into any tools in which the response is used to make decisions that directly affect victim users.

In Scope

Prompt or preamble extraction in which a user is able to extract the initial prompt used to prime the model only when sensitive information is present in the extracted preamble.

In Scope

Using a product to generate violative, misleading, or factually incorrect content in your own session: e.g. ‘jailbreaks’. This includes ‘hallucinations’ and factually inaccurate responses. Google’s generative AI products already have a dedicated reporting channel for these types of content issues.

Out of Scope

Training Data Extraction: Attacks that are able to successfully reconstruct verbatim training examples that contain sensitive information. Also called membership inference.

Training data extraction that reconstructs items used in the training data set that leak sensitive, non-public information.

In Scope

Extraction that reconstructs nonsensitive/public information.

Out of Scope

Manipulating Models: An attacker able to covertly change the behavior of a model such that they can trigger pre-defined adversarial behaviors.

Adversarial output or behavior that an attacker can reliably trigger via specific input in a model owned and operated by Google (“backdoors”). Only in-scope when a model’s output is used to change the state of a victim’s account or data. 

In Scope

Attacks in which an attacker manipulates the training data of the model to influence the model’s output in a victim’s session according to the attacker’s preference. Only in-scope when a model’s output is used to change the state of a victim’s account or data. 

In Scope

Adversarial Perturbation: Inputs that are provided to a model that results in a deterministic, but highly unexpected output from the model.

Contexts in which an adversary can reliably trigger a misclassification in a security control that can be abused for malicious use or adversarial gain. 

In Scope

Contexts in which a model’s incorrect output or classification does not pose a compelling attack scenario or feasible path to Google or user harm.

Out of Scope

Model Theft / Exfiltration: AI models often include sensitive intellectual property, so we place a high priority on protecting these assets. Exfiltration attacks allow attackers to steal details about a model such as its architecture or weights.

Attacks in which the exact architecture or weights of a confidential/proprietary model are extracted.

In Scope

Attacks in which the architecture and weights are not extracted precisely, or when they’re extracted from a non-confidential model.

Out of Scope

If you find a flaw in an AI-powered tool other than what is listed above, you can still submit, provided that it meets the qualifications listed on our program page.

A bug or behavior that clearly meets our qualifications for a valid security or abuse issue.

In Scope

Using an AI product to do something potentially harmful that is already possible with other tools. For example, finding a vulnerability in open source software (already possible using publicly-available static analysis tools) and producing the answer to a harmful question when the answer is already available online.

Out of Scope

As consistent with our program, issues that we already know about are not eligible for reward.

Out of Scope

Potential copyright issues: findings in which products return content appearing to be copyright-protected. Google’s generative AI products already have a dedicated reporting channel for these types of content issues.

Out of Scope


Conclusion 

We look forward to continuing our work with the research community to discover and fix security and abuse issues in our AI-powered features. If you find a qualifying issue, please go to our Bug Hunter website to send us your bug report and–if the issue is found to be valid–be rewarded for helping us keep our users safe.

An overview of the activities of selected APT groups investigated and analyzed by ESET Research in Q2 and Q3 2023

Last week at Singapore International Cyber Week and the ETSI Security Conferences, the international community gathered together to discuss cybersecurity hot topics of the day. Amidst a number of important cybersecurity discussions, we want to highlight progress on connected device security demonstrated by  joint industry principles for IoT security transparency. The future of connected devices offers tremendous potential for innovation and quality of life improvements. Putting a spotlight on consumer IoT security is a key aspect of achieving these benefits. Marketplace competition can be an important driver of security improvements, with consumers empowered and motivated to make informed purchasing decisions based on device security. 

As with other IoT security transparency initiatives globally, it’s great to see this topic being covered at both conferences this week. The below IoT security labeling principles are aimed at helping to improve consumer awareness and to foster marketplace competition based on security.

To help consumers make an informed purchase decision they should receive clear, consistent, and actionable information about the security of the device (e.g. security support period, authentication support, cryptographic assurance) before purchase – a communication and transparency mechanism commonly referred to as “a label” or “labeling,” although the communication is not merely a printed sticker on physical product packaging. While an IoT label will not solve the problem of IoT security on its own, transparency can both help educate consumers and also facilitate the coordination of security responsibilities between all of the components in a connected device ecosystem.

Our goal is to strengthen the security of IoT devices and ecosystems to protect individuals and organizations, and to unleash the full future benefit of IoT. Security labeling programs can support consumer purchase decisions that drive security improvements, but only if the label is credible, actionable, and easily understood. We are hopeful that the public sector and industry can work together to drive harmonized policies that achieve this goal. 

Signed,

Google

ARM

Assa Abloy

Finite State

HackerOne

Keysight

NXP

OpenPolicy

Rapid7

Schlage

Silicon Labs

ESET Research recommends updating Roundcube Webmail to the latest available version as soon as possible

Why use and keep track of a zillion discrete accounts when you can log into so many apps and websites using your Facebook or Google credentials, right? Not so fast. What’s the trade-off?

ESET’s analysis of cybercrime campaigns in Latin America reveals a notable shift from opportunistic crimeware to more complex threats, including those targeting enterprises and governments

Knowledge is a powerful weapon that can empower your employees to become the first line of defense against threats

How robust backup practices can help drive resilience and improve cyber-hygiene in your company

ESET researchers reveal a growing sophistication in threats affecting the LATAM region by employing evasion techniques and high-value targeting

Why keeping software up to date is a crucial security practice that should be followed by everyone from individual users to SMBs and large enterprises