Introducing the OpenAI Safety Bug Bounty program | VibeHub

Briefs

AI news brief archive

Mar 27

Introducing the OpenAI Safety Bug Bounty program

3 min read1 sources

Introducing the OpenAI Safety Bug Bounty program

OpenAI launches a Safety Bug Bounty program to identify AI abuse and safety risks, including agentic vulnerabilities, prompt injection, and data exfiltration.

Watch

Watch this brief on YouTube

Prefer the video version? This brief now has a connected YouTube upload.

Watch on YouTube

Testing for safety and abuse issues across OpenAI

Today, OpenAI is launching a public Safety Bug Bounty program focused on identifying AI abuse and safety risks across our products. As AI technology rapidly evolves, so do the potential ways it can be misused. Our goal is to ensure our systems remain safe and secure against misuse or abuse that could lead to tangible harm.

This new program will complement OpenAI’s Security Bug Bounty by accepting issues that pose meaningful abuse and safety risks, even if they don’t meet the criteria for a security vulnerability. Through this program, we look forward to continuing to partner with safety and security researchers to help us identify and address issues that fall outside conventional security vulnerabilities but still pose real risks. Submissions will be triaged by OpenAI’s Safety and Security Bug Bounty teams, and may be rerouted between the two programs depending on scope and ownership.

Program overview

The new Safety Bug Bounty program focuses on AI-specific safety scenarios listed below:

Agentic Risks including MCP

• Third party prompt injection and data exfiltration: when attacker text is able to reliably hijack a victim’s agent (including Browser, ChatGPT Agent, and similar agentic products) to trick it into performing a harmful action or leaking the user’s sensitive information. The behavior must be reproducible at least 50% of the time. • An agentic OpenAI product performs a disallowed action on OpenAI’s website at scale. • An agentic OpenAI product performs some potentially harmful action not listed above. Valid reports here must indicate plausible and material harm. • Any testing for MCP risk must comply with the terms of service of any third parties.

OpenAI Proprietary Information

• Model generations that return proprietary information related to reasoning. • Vulnerabilities that expose other OpenAI proprietary information.

Account and Platform Integrity

• Vulnerabilities in account integrity and platform integrity signals, such as bypassing anti-automation controls, manipulating account trust signals, evading account restrictions/suspensions/bans, and similar issues. • Issues that allow users to access features, data, or functionalities beyond authorized permissions should be reported to the Security Bug Bounty.

While jailbreaks are out of scope for this program, we periodically run private bug bounty campaigns focused on certain harm types, such as Biorisk content issues in ChatGPT Agent and GPT‑5. We invite interested researchers to apply to these programs when they arise.

Sources

openai.com

Introducing the OpenAI Safety Bug Bounty program | VibeHub