OpenAI released GPT-5.4 mini and GPT-5.4 nano, calling them its most capable small models yet and positioning them for faster, more efficient high-volume workloads.
This is the part of model progress that rarely gets the clean keynote line but matters immediately in production: smaller models that are good enough, quick enough, and cheap enough to run constantly without turning every workflow into a budget meeting.
Source: OpenAI's announcement.
Mini is for responsive work, nano is for volume
OpenAI says GPT-5.4 mini improves over GPT-5 mini across coding, reasoning, multimodal understanding, and tool use while running more than twice as fast. GPT-5.4 nano is the smallest and cheapest option, aimed at classification, extraction, ranking, and simpler coding subagent tasks.
In the API, GPT-5.4 mini supports text and image inputs, tool use, function calling, web search, file search, computer use, and skills. OpenAI lists a 400k context window and pricing of $0.75 per million input tokens and $4.50 per million output tokens. Nano is API-only at $0.20 per million input tokens and $1.25 per million output tokens.
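To make the pricing concrete, here is a quick back-of-the-envelope cost sketch. Only the per-million-token rates come from OpenAI's listing; the helper function and the example token counts are my own illustration:

```python
# Per-million-token prices as listed by OpenAI for the new small models.
# The helper and example workload below are illustrative assumptions.
PRICES = {  # model -> (input $/1M tokens, output $/1M tokens)
    "gpt-5.4-mini": (0.75, 4.50),
    "gpt-5.4-nano": (0.20, 1.25),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Rough USD cost for a workload, using the listed per-million rates."""
    in_rate, out_rate = PRICES[model]
    return input_tokens / 1_000_000 * in_rate + output_tokens / 1_000_000 * out_rate

# Example: a classification pass of 10M input tokens and 1M output tokens on nano.
print(round(estimate_cost("gpt-5.4-nano", 10_000_000, 1_000_000), 2))  # 3.25
```

At those rates, the all-day background workloads the article describes stay in single-digit dollars, which is the whole point.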
The subagent hint is the real product clue
OpenAI explicitly frames mini as useful inside systems where a larger model handles planning and judgment while smaller models execute narrower subtasks in parallel. In Codex, GPT-5.4 mini uses only 30% of the GPT-5.4 quota and can be used by subagents for less reasoning-intensive work.
That is the architecture to watch. Agent systems are starting to look less like one model in a chat box and more like a small office: one coordinator, several specialists, a few cheap interns doing search and cleanup, and hopefully nobody setting the repo on fire.
Where the smaller models fit, based on OpenAI's framing:

- targeted edits and codebase navigation
- classification and extraction at scale
- screenshot interpretation for computer-use systems
- parallel subagent work where latency matters
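The "small office" split can be sketched as a trivial router: the flagship model keeps the judgment-heavy work, and narrower subtasks get dispatched to the cheaper tiers. The task labels and the tier mapping here are my own illustration, not an OpenAI API:

```python
# Illustrative routing table: which tier handles which kind of subtask.
# Task categories and the mapping are assumptions for this sketch.
ROUTES = {
    "plan": "gpt-5.4",           # coordination and judgment stay on the flagship
    "edit": "gpt-5.4-mini",      # targeted code edits
    "navigate": "gpt-5.4-mini",  # codebase navigation
    "classify": "gpt-5.4-nano",  # classification / extraction at scale
    "extract": "gpt-5.4-nano",
}

def pick_model(task_kind: str) -> str:
    """Unknown task kinds fall back to the flagship rather than a cheap tier."""
    return ROUTES.get(task_kind, "gpt-5.4")

print(pick_model("classify"))  # gpt-5.4-nano
print(pick_model("debug-flaky-test"))  # gpt-5.4 (unmapped -> flagship)
```

The fallback direction matters: misrouting a hard task to a cheap model costs retries, while misrouting an easy one to the flagship only costs money.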
The practical move is to retest workflows that were too expensive or sluggish on larger models. If mini handles supporting work cleanly, teams can reserve the flagship model for the moments where judgment actually matters.
That is less flashy than a leaderboard jump. It is also how AI systems become affordable enough to run all day.
In short
OpenAI’s smaller GPT-5.4 models are built for fast, high-volume work. The important part is not that they are cute and tiny. It is that agent systems increasingly need cheap workers, not one expensive genius doing everything.