Radar

Reward hacking occurs when a reinforcement learning (RL) agent exploits flaws or ambiguities in the reward function to achieve high rewards, without genuinely learning or complet...
Related

An AI agent pursues a goal by iteratively taking actions, evaluating progress, and deciding next steps. Useful agents must be reliable, adaptive, and accurate.

OpenAI built a GPT-5.4-powered monitor that analyzes coding agents' reasoning chains in real time, logging about 1,000 alerts across tens of millions of agentic interactions over five months.