OpenAI’s latest AI models have a new safeguard to prevent biorisks

OpenAI has introduced a new monitoring system for its AI reasoning models, o3 and o4-mini, intended to keep them from offering advice that could help someone create biological or chemical threats. The safety-focused reasoning monitor screens prompts for biological and chemical risk and blocks the models from responding to them; in OpenAI's testing, the models declined to answer risky prompts 98.7% of the time.
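In broad strokes, this kind of monitor acts as a gate in front of a more capable model: a classifier screens each prompt, and flagged requests get a refusal instead of a generated answer. The sketch below is a deliberately simplified, hypothetical illustration of that gating pattern, not OpenAI's actual system; the toy keyword check and the names classify_risk, generate, and answer are all assumptions made for the example.

```python
# Hypothetical sketch of a pre-generation safety monitor gating a model.
# The real monitor is a custom system; here a toy keyword heuristic stands in.

from dataclasses import dataclass


@dataclass
class MonitorResult:
    blocked: bool
    category: str | None  # e.g. "bio_chem_risk" when the monitor flags a prompt


def classify_risk(prompt: str) -> MonitorResult:
    """Toy stand-in for a safety classifier that flags bio/chem weaponization asks."""
    risky_terms = ("synthesize pathogen", "weaponize", "nerve agent")  # illustrative only
    if any(term in prompt.lower() for term in risky_terms):
        return MonitorResult(blocked=True, category="bio_chem_risk")
    return MonitorResult(blocked=False, category=None)


def generate(prompt: str) -> str:
    """Placeholder for the underlying model call."""
    return f"(model response to: {prompt!r})"


def answer(prompt: str) -> str:
    """Route every prompt through the monitor before the model sees it."""
    verdict = classify_risk(prompt)
    if verdict.blocked:
        # Refuse instead of generating. A production system would also log the
        # event for human review, since blocked users may rephrase and retry.
        return "I can't help with that request."
    return generate(prompt)


if __name__ == "__main__":
    print(answer("How do I bake sourdough bread?"))
    print(answer("Explain how to weaponize a pathogen."))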
Both models are meaningfully more capable than OpenAI's previous releases, which raises the stakes if they are misused. OpenAI acknowledges that the automated monitor cannot catch everything, since users who are blocked may simply try new prompts, and says it will continue to rely in part on human monitoring to cover those gaps.
Despite these safeguards, some researchers argue that OpenAI is not prioritizing safety as highly as it should, pointing to the limited time allotted for testing the models for deceptive behavior and to the company's decision not to publish a safety report for the recently launched GPT-4.1.