OpenAI shrinks GPT-5.4 for speed and lower costs
By Paulo Vargas
Published March 18, 2026
OpenAI is scaling its latest models down to hit a different target: faster responses and much lower costs. The new GPT-5.4 mini and nano are built for developers who care more about responsiveness than squeezing out every last bit of reasoning power.
Both models are available starting today. GPT-5.4 mini runs more than twice as fast as its predecessor while staying close to the full GPT-5.4 on key benchmarks. GPT-5.4 nano takes that further, focusing on simpler tasks like classification and data extraction where efficiency matters most.
This approach fits apps where speed shapes the experience. Coding assistants, background agents, and real-time vision tools depend on quick feedback, and in those cases a slightly smaller model often delivers a better overall result.
The performance gap between models is narrower than you might expect. GPT-5.4 mini scores 54.4 percent on SWE-Bench Pro, compared to 57.7 percent for the full model, a gap of 3.3 percentage points. On OSWorld-Verified, the mini reaches 72.1 percent while the larger version hits 75 percent, a gap of 2.9 points.
Costs drop far more dramatically. GPT-5.4 mini is priced at $0.75 per million input tokens and $4.50 per million output tokens, while nano comes in at $0.20 per million input tokens and $1.25 per million output tokens. Both models support text and image inputs, tool use, function calling, and a 400,000-token context window, so the lower price doesn’t strip away core capabilities.
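To see what those prices mean for a given workload, here is a rough sketch of the arithmetic. The per-million-token prices are the ones quoted above; the dictionary keys are illustrative labels rather than confirmed API model identifiers, and the token counts in the example are made up.

```python
# Prices in USD per million tokens, as quoted in this article.
# The keys are illustrative labels, not confirmed API model identifiers.
PRICES_PER_MILLION = {
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one workload."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    for model in PRICES_PER_MILLION:
        cost = estimate_cost(model, input_tokens=2_000_000, output_tokens=500_000)
        print(f"{model}: ${cost:.3f}")
```

Because output tokens cost several times more than input tokens on both tiers, workloads that generate long responses will see output dominate the bill.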
In Codex, the mini model uses just 30 percent of the GPT-5.4 quota. That lets developers shift routine coding work to a cheaper tier while saving the full model for harder reasoning.
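As a back-of-the-envelope illustration of that trade, the sketch below estimates relative quota use when some share of work moves to the mini tier. It assumes the 30 percent figure applies uniformly to comparable tasks and that quota use scales linearly, neither of which the article confirms; the function name is hypothetical.

```python
# From the article: mini uses 30 percent of the GPT-5.4 quota in Codex.
MINI_QUOTA_SHARE = 0.30


def relative_quota_use(fraction_on_mini: float) -> float:
    """Quota used relative to running everything on the full model.

    Assumes linear scaling: work kept on the full model costs 1.0 per unit,
    work moved to mini costs MINI_QUOTA_SHARE per unit.
    """
    if not 0.0 <= fraction_on_mini <= 1.0:
        raise ValueError("fraction_on_mini must be between 0 and 1")
    return (1.0 - fraction_on_mini) + fraction_on_mini * MINI_QUOTA_SHARE


if __name__ == "__main__":
    # Hypothetical split: 60 percent of coding tasks shifted to mini.
    print(f"{relative_quota_use(0.6):.2f}")
```

Under these assumptions, shifting 60 percent of tasks to mini would use roughly 58 percent of the quota that the full model alone would consume.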
OpenAI is also pushing a multi-model workflow. Instead of relying on one system, developers can split work across tiers, pairing a larger model for planning with smaller ones handling execution.
That setup reflects how many real apps already behave. One model can review a codebase or decide on changes, while another processes supporting data or repetitive steps. The smaller model handles the predictable work, while the larger one focuses on judgment and coordination.
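One way to picture that split is a small routing table that sends each task to a tier based on its type. This is a minimal sketch, not OpenAI's implementation: the tier labels, task kinds, and the `pick_tier` helper are all hypothetical, and the mapping simply follows the roles described above.

```python
from dataclasses import dataclass

# Hypothetical tier labels; these are not confirmed API model identifiers.
FULL, MINI, NANO = "full", "mini", "nano"


@dataclass
class Task:
    description: str
    kind: str  # e.g. "planning", "review", "code_edit", "classification"


# Assumed mapping: judgment and coordination go to the full model,
# simple classification and extraction go to nano.
ROUTES = {
    "planning": FULL,
    "review": FULL,
    "classification": NANO,
    "extraction": NANO,
}


def pick_tier(task: Task) -> str:
    """Choose a tier for a task; routine work defaults to the mini tier."""
    return ROUTES.get(task.kind, MINI)


if __name__ == "__main__":
    tasks = [
        Task("Decide which modules to refactor", "planning"),
        Task("Apply a rename across files", "code_edit"),
        Task("Label support tickets by topic", "classification"),
    ]
    for task in tasks:
        print(f"{pick_tier(task):>4}  {task.description}")
```

In a real application, the returned tier would select which model handles the request, and the routing rules would likely be tuned against measured quality and latency rather than fixed by task type.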
Early feedback suggests this mix is effective. Hebbia CTO Aabhas Sharma reported that GPT-5.4 mini matched or outperformed competing models on several tasks at a lower cost, and in some cases even delivered stronger end-to-end results than the full GPT-5.4.
GPT-5.4 mini is now available across the API, Codex, and ChatGPT. Free and Go users can access it through the Thinking option, while other users may see it as a fallback when they hit limits on GPT-5.4 Thinking.
The nano model is currently limited to the API, aimed at teams running high-volume workloads where cost control is critical. Both models are live today with full documentation available.
For developers building real-time AI features, the shift is clear. Smaller models are now capable enough to handle a larger share of everyday work, which makes choosing the right balance of speed, cost, and capability an increasingly practical decision.