Briefs
3 days ago
OpenAI released GPT-5.4 mini and nano, two compact models designed to bring frontier-level reasoning to cost-sensitive and latency-critical applications while maintaining strong benchmark scores.
OpenAI has released two new additions to the GPT-5 family: GPT-5.4 mini and GPT-5.4 nano. Both models are optimized for speed and cost efficiency, targeting developers who need high-quality reasoning without the latency and price tag of the full GPT-5.4 model. Mini offers roughly 90% of GPT-5.4's benchmark performance at one-fifth the cost, while nano pushes the envelope further with sub-100ms response times for simple tasks.
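The "one-fifth the cost" claim is easy to sanity-check with back-of-envelope arithmetic. The per-token prices below are hypothetical placeholders (no rates are given in this brief); only the 5x ratio between them reflects the claim.

```python
# Back-of-envelope cost comparison for the "one-fifth the cost" claim.
# Both prices are assumed placeholders, not published rates.
FULL_PRICE_PER_M = 10.00  # assumed $/1M input tokens, GPT-5.4
MINI_PRICE_PER_M = 2.00   # assumed $/1M input tokens, GPT-5.4 mini

def monthly_cost(tokens_per_month: int, price_per_m: float) -> float:
    """Dollar cost for a given monthly input-token volume."""
    return tokens_per_month / 1_000_000 * price_per_m

volume = 500_000_000  # e.g. 500M input tokens per month
full = monthly_cost(volume, FULL_PRICE_PER_M)
mini = monthly_cost(volume, MINI_PRICE_PER_M)
print(f"full: ${full:,.0f}  mini: ${mini:,.0f}  ratio: {full / mini:.1f}x")
# full: $5,000  mini: $1,000  ratio: 5.0x
```

At any realistic volume the ratio is what matters: swapping the full model for mini cuts the token bill by the same factor regardless of traffic.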
The launch signals OpenAI's strategic shift toward model accessibility. As competitors like Anthropic (Claude Haiku) and Google (Gemini Flash) have shown, the real growth in AI adoption comes not from the largest models but from affordable, fast ones that developers can embed everywhere — chatbots, code completions, content filters, and real-time agents. Mini and nano fill the gap that GPT-4o mini left behind, with meaningfully better reasoning at similar price points.
Both models are distilled from GPT-5.4, retaining its chain-of-thought capabilities in a far smaller parameter footprint. OpenAI reports that mini scores 82.1% on MMLU-Pro (vs. 87.3% for the full model) and that nano handles function calling with 95% accuracy on the Berkeley Function Calling Leaderboard. Notably, nano supports a 32K context window (double what GPT-4o mini offered), making it viable for document-level tasks.
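For readers unfamiliar with function calling, the request shape looks like the sketch below. The tool definition follows the OpenAI chat-completions `tools` schema; the model ID `gpt-5.4-nano` and the `get_weather` tool are assumptions for illustration, not confirmed identifiers.

```python
# Sketch of a function-calling request body in the OpenAI
# chat-completions format. Model ID and tool are hypothetical.
import json

get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool name
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

request_body = {
    "model": "gpt-5.4-nano",  # assumed model identifier
    "messages": [{"role": "user", "content": "Weather in Oslo?"}],
    "tools": [get_weather_tool],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}
print(json.dumps(request_body, indent=2))
```

A model scoring 95% on the Berkeley leaderboard would, for a prompt like this, reliably emit a structured `tool_calls` entry with valid JSON arguments rather than free-text, which is what makes small models usable as agents.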
This release comes two weeks after Anthropic's Claude 4.5 Haiku update and one month after Google's Gemini 2.5 Flash launch. The small-model race is now a three-way competition where price-per-token, latency, and tool-use reliability are the key differentiators. For VibeHub's pipeline, mini is a strong candidate for the classifier and brief-draft stages where speed matters more than peak creativity.
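The stage assignment suggested above can be expressed as a tiny routing table. The stage names come from this brief; the model IDs and the default choice are placeholder assumptions, not a confirmed VibeHub configuration.

```python
# Minimal stage-to-model router for the pipeline described above.
# Stage names are from the brief; model IDs are assumed placeholders.
STAGE_MODEL = {
    "classifier": "gpt-5.4-mini",   # latency-sensitive: mini
    "brief_draft": "gpt-5.4-mini",  # throughput-sensitive: mini
    "final_edit": "gpt-5.4",        # quality-sensitive: full model
}

def model_for(stage: str) -> str:
    """Return the model assigned to a pipeline stage, defaulting to mini."""
    return STAGE_MODEL.get(stage, "gpt-5.4-mini")

print(model_for("classifier"))  # gpt-5.4-mini
print(model_for("final_edit"))  # gpt-5.4
```

Keeping the mapping in one table makes the cost/quality trade-off explicit and lets a new small model be A/B-tested per stage by changing a single entry.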