Radar
Radar
Radar
Special thanks to John Schulman for a lot of super valuable feedback and direct edits on this post. Test time compute (Graves et al.
Related

An AI agent pursues a goal by iteratively taking actions, evaluating progress, and deciding next steps. Useful agents must be reliable, adaptive, and accurate.

OpenAI built a GPT-5.4-powered monitor that analyzes coding agents' reasoning chains in real time, logging about 1,000 alerts across tens of millions of agentic interactions over five months.