
OpenAI Launches New GPT-4.1 Models: Stronger Coding Capabilities at Lower Prices


OpenAI has officially unveiled its new GPT-4.1 family, introducing GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano, each optimized for real-world programming tasks and instruction following. Despite a more "literal" response tendency, GPT-4.1 excels when given clear, specific prompts. It also leads in video content understanding, scoring 72% in the "long, no subtitles" category of the Video-MME benchmark. These new OpenAI models are accessible via the OpenAI API and Azure OpenAI platforms.

What Is the GPT-4.1 Series?

GPT-4.1 is a new generation of AI language models released by OpenAI on April 15, 2025. The family consists of the flagship GPT-4.1, the highly efficient GPT-4.1 mini, and the ultra-small GPT-4.1 nano. It marks another major step in the commercialization of AI technology and in developer ecosystems, with breakthroughs in performance, cost, and responsiveness, and it is available only through the API.

GPT-4.1 Key Highlights


Performance Leap

  • GPT-4.1: Best performance in coding, instruction compliance, and long-text comprehension; supports a very long context window of 1 million tokens for complex tasks. Scores 54.6% on SWE-bench, 21.4 percentage points better than GPT-4o and 26.6 percentage points better than GPT-4.5.
  • GPT-4.1 mini: Nearly half the latency and 83% lower cost than GPT-4o, while outperforming it in multiple benchmarks; suitable for scenarios where performance and efficiency come first.
  • GPT-4.1 nano: The fastest and lowest-cost model; supports a 1-million-token context window, scores 80.1% on MMLU and 50.3% on GPQA, and suits low-latency tasks such as classification and autocompletion.

Cost Optimization: Highly Competitive Pricing

The GPT-4.1 series performs well across a number of benchmarks while being competitively priced. For example, GPT-4.1 nano costs only 12 cents per million tokens, significantly lowering the barrier to entry for developers.

Multimodal and Long Text Capabilities

The GPT-4.1 series supports up to 1 million tokens of ultra-long context, suitable for tasks such as analyzing large code bases and reviewing multiple documents. Across a large number of tests, GPT-4.1 accurately retrieves information from ultra-long contexts and outperforms GPT-4o.

GPT-4.1 Series Technical Breakthroughs

Programming Ability

Boasting a massive 1-million-token context window — capable of handling around 750,000 words — GPT-4.1 surpasses GPT-4o in performance and aims to serve developers looking to build robust AI coding agents. OpenAI describes this release as a significant step toward creating an “agentic software engineer,” capable of handling everything from coding and testing to documentation.
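
As a rough illustration of working within that window, the sketch below counts prompt tokens before sending a request. The tiktoken library, the o200k_base encoding, and the file name are assumptions here, since the article does not state which tokenizer GPT-4.1 uses.

```python
# Minimal sketch: checking whether a prompt fits the 1-million-token window.
# Assumes the tiktoken library and the o200k_base encoding (used by recent
# OpenAI models); the exact GPT-4.1 tokenizer is not specified in this article.
import tiktoken

CONTEXT_WINDOW = 1_000_000  # tokens, per the GPT-4.1 announcement


def fits_in_context(text: str, reserved_output: int = 32_768) -> bool:
    """Return True if the prompt plus a reserved output budget fits the window."""
    enc = tiktoken.get_encoding("o200k_base")
    prompt_tokens = len(enc.encode(text))
    return prompt_tokens + reserved_output <= CONTEXT_WINDOW


if __name__ == "__main__":
    sample = open("repo_dump.txt").read()  # hypothetical concatenated code base
    print(fits_in_context(sample))
```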

In the SWE-bench Verified benchmark, GPT-4.1 scored 54.6%, 21.4 percentage points higher than GPT-4o. Code generation is reportedly 40% faster than GPT-4o, and the cost per user query is reduced by about 80%.

Instruction Compliance and MultiChallenge

In the MultiChallenge test, GPT-4.1 scores 10.5 percentage points higher than GPT-4o. In real conversations, especially multi-turn interaction and code-editing tasks, the model reliably outputs only the modified segments rather than rewriting entire files, saving latency and cost.

Multimodal Long Context Processing

In the Video-MME benchmark's "long, no subtitles" category, GPT-4.1 set a new record with a score of 72.0%, 6.7 percentage points ahead of GPT-4o.

OpenAI GPT-4.1 Pricing & Model Variants

Image source: OpenAI

Each model in the GPT-4.1 family balances speed, efficiency, and cost:

  • GPT-4.1: $2 per million input tokens / $8 per million output tokens
  • GPT-4.1 mini: $0.40 per million input / $1.60 per million output
  • GPT-4.1 nano: $0.10 per million input / $0.40 per million output

This OpenAI pricing strategy gives developers flexibility based on their performance and budget needs. The nano model, OpenAI's fastest and most affordable yet, is ideal for lightweight tasks.
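
To make the price table concrete, here is a minimal sketch that estimates the cost of a single request from the per-million-token rates listed above; the helper function and the example token counts are purely illustrative.

```python
# Minimal sketch: estimating request cost from the per-million-token prices
# listed above. The price table mirrors the article; token counts are illustrative.
PRICES_PER_MILLION = {             # (input $, output $) per 1M tokens
    "gpt-4.1":      (2.00, 8.00),
    "gpt-4.1-mini": (0.40, 1.60),
    "gpt-4.1-nano": (0.10, 0.40),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request."""
    in_price, out_price = PRICES_PER_MILLION[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Example: a 50,000-token prompt with a 2,000-token answer on GPT-4.1 mini
print(f"${request_cost('gpt-4.1-mini', 50_000, 2_000):.4f}")  # ≈ $0.0232
```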

Developer Support

API-Only Access

The GPT-4.1 series is available only through the API; developers can access these models and integrate them into their applications via OpenAI's API platform. Microsoft has also launched the new lineup (GPT-4.1, GPT-4.1-mini, and GPT-4.1-nano) on Azure OpenAI Service and GitHub.
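
For readers who want to try the models, here is a minimal sketch of a call through the official openai Python SDK's Chat Completions endpoint; the prompt is a placeholder, and the API key is assumed to be set in the OPENAI_API_KEY environment variable. On Azure, the same SDK's AzureOpenAI client with a deployment name would be used instead.

```python
# Minimal sketch: calling GPT-4.1 through the OpenAI API using the official
# `openai` Python SDK (Chat Completions endpoint). The prompt is a placeholder;
# OPENAI_API_KEY is read from the environment.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY automatically

response = client.chat.completions.create(
    model="gpt-4.1",  # or "gpt-4.1-mini" / "gpt-4.1-nano"
    messages=[
        {"role": "system", "content": "You are a careful coding assistant."},
        {"role": "user", "content": "Write a Python function that reverses a string."},
    ],
)
print(response.choices[0].message.content)
```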

Ecosystem Enhancements

OpenAI has increased the output token limit of GPT-4.1 to 32,768 tokens, which helps with full-file rewriting. At the same time, the prompt caching mechanism has been optimized, and the caching discount has been increased from 50% to 75% to further reduce the cost of use.
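
As a sketch of how that larger output budget might be used for a full-file rewrite, the call below caps output near the raised limit. The max_tokens parameter name is an assumption (some newer endpoints use max_completion_tokens instead), the file path is hypothetical, and the prompt-caching discount mentioned above is applied server-side, so no extra parameter appears here.

```python
# Minimal sketch: asking GPT-4.1 for a full-file rewrite and capping output
# near the raised 32,768-token limit. `max_tokens` is assumed here; some newer
# endpoints use `max_completion_tokens`. The file path is hypothetical.
from openai import OpenAI

client = OpenAI()

source = open("app.py").read()  # hypothetical file to rewrite

response = client.chat.completions.create(
    model="gpt-4.1",
    max_tokens=32_768,  # raised output limit mentioned above
    messages=[
        {"role": "user",
         "content": "Rewrite this file with type hints, keeping behavior identical:\n\n" + source},
    ],
)
print(response.choices[0].message.content)
```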

Future: OpenAI Quasar Program

OpenAI is reportedly working toward an "agentic software engineer" capable of end-to-end programming, with a long-term vision of one-stop application development, quality assurance, bug fixing, and documentation. Further research and iteration (under the Quasar codename) are expected to keep improving the GPT line, address the shortcomings of the current technology, and build a broader application ecosystem across the industry.

Discover how Scifocus empowers students and researchers to harness the full potential of cutting-edge models like GPT-4.1 for faster, smarter academic writing and research.

The Shortcomings of the GPT-4.1 Series

Despite strong results on coding and video-understanding tasks (e.g., 72% accuracy on the Video-MME test), GPT-4.1 still faces challenges. Accuracy degrades as the amount of input grows, dropping from 84% at 8,000 tokens to only 50% at 1 million tokens. In addition, the new model is sometimes too "literal" in interpreting questions, requiring users to provide more explicit and detailed instructions.

