
OpenAI has raised the usage limit for GPT-5’s “Thinking” mode to 3,000 messages per week, responding to widespread criticism of the model’s initially restrictive message caps. The change follows user reports of performance issues and protests over the removal of legacy models from ChatGPT’s interface.[1]
TL;DR: Key Points for Security Professionals
- GPT-5’s “Thinking” mode initially launched with a 200 messages/week limit, now increased to 3,000
- Performance benchmarks show 74.9% on SWE-bench coding tests and 94.6% on AIME math problems
- Enterprise version supports 128K context window with Gmail/Calendar integration
- 45% reduction in hallucinations compared to previous models
- API now includes reasoning_effort and verbosity parameters for developers
Technical Specifications and Performance
GPT-5, released on August 7, 2025, combines capabilities from GPT-4o, GPT-4.1, and o3 variants into a unified model.[2] The system automatically switches between “Chat” (optimized for speed) and “Thinking” (designed for complex reasoning) modes based on query complexity. Benchmark tests demonstrate significant improvements over previous versions, including 74.9% accuracy on SWE-bench coding challenges and 94.6% on AIME math problems without external tools.[3]
On multimodal tasks, the model scored 84.2% on MMMU visual reasoning tests, while safety improvements reduced hallucinations by 45%. OpenAI also implemented “safe completions,” which provide nuanced answers instead of outright refusals when encountering sensitive queries.[4]
User Backlash and Adjustments
The initial rollout faced significant criticism from technical users, particularly over the removal of GPT-4o and o3 variants from the standard interface. Performance complaints emerged on OpenAI’s community forums, with users noting slower response times for coding and mathematical tasks compared to GPT-4.1.[5]
The original 200 messages/week limit for “Thinking” mode proved particularly controversial among power users and developers. OpenAI responded by increasing this limit fifteen-fold to 3,000 messages per week while restoring legacy model access for Plus subscribers.[6] The company also added clearer UI indicators showing the active mode (e.g., “GPT-5 Thinking”) to help users understand system behavior.
Enterprise and API Features
GPT-5 offers tiered access levels with context windows ranging from 8K (free tier) to 128K (Pro/Enterprise). The API supports up to 400K tokens and adds new parameters, including reasoning_effort and verbosity, for finer-grained control over reasoning depth and output length.[7] Enterprise features include Gmail and Calendar integration for automated email drafting and scheduling, along with custom tool creation using plain-text constraints.
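For developers, these controls surface directly as request parameters. Below is a minimal sketch using the OpenAI Python SDK’s Chat Completions endpoint; the flat reasoning_effort and verbosity names follow this article’s description of the API, while the specific values, prompt, and settings are illustrative assumptions (older SDK releases may require passing unrecognized parameters via extra_body, and the Responses API exposes the same controls as nested options).

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative call: parameter names follow this article's description of the
# GPT-5 API; the values and prompt are placeholder assumptions.
response = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "system", "content": "You are a concise security analysis assistant."},
        {"role": "user", "content": "Summarize the anomalies in this sign-in log excerpt: ..."},
    ],
    reasoning_effort="medium",  # trade reasoning depth against latency and cost
    verbosity="low",            # keep answers short for downstream tooling
)

print(response.choices[0].message.content)
```

In practice, reasoning_effort works as a cost-and-latency knob: batch or non-interactive security jobs can tolerate higher settings, while interactive triage usually cannot.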
Security teams should note the model’s improved ability to analyze large documents, though testing revealed limitations with 167-page PDFs. In hands-on evaluations, the system generated a 764-line pixelated dinosaur game with pause functionality, demonstrating advanced code generation capabilities.[8]
Security Implications and Recommendations
The expanded context window (up to 128K) presents both opportunities and risks for security applications. While it enables more comprehensive analysis of logs and threat intelligence, the added scale also raises the potential for more sophisticated hallucinations. Organizations implementing GPT-5 should:
- Monitor outputs for consistency when processing security-related data
- Establish clear usage policies for “Thinking” mode given the weekly limit
- Validate code generation outputs before deployment in production environments
- Consider API rate limits when integrating with security tooling (see the retry sketch after this list)
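For the rate-limit point above, a thin retry wrapper is usually sufficient. The sketch below assumes the OpenAI Python SDK, which raises RateLimitError on HTTP 429 responses; the helper name, retry count, and backoff schedule are illustrative placeholders to tune against your own quota.

```python
import random
import time

from openai import OpenAI, RateLimitError

client = OpenAI()


def ask_with_backoff(prompt: str, max_retries: int = 5) -> str:
    """Call GPT-5 and back off exponentially when the API returns HTTP 429."""
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="gpt-5",
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content
        except RateLimitError:
            # Exponential backoff with jitter: ~1s, 2s, 4s, ... plus up to 1s of noise.
            time.sleep(2 ** attempt + random.random())
    raise RuntimeError("rate limit persisted after retries")
```

Pipelines that fan out many calls (per-alert enrichment, bulk log summarization) should also put a shared limiter in front of this wrapper rather than relying on retries alone.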
OpenAI has not disclosed specific details about the computational resources required for “Thinking” mode, though infrastructure strain became apparent when usage reached 24% of total queries.[9] This suggests organizations should plan for potential latency during peak periods when incorporating GPT-5 into security workflows.
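Both caveats, peak-period latency and the weekly “Thinking” cap, are easy to instrument in the same wrapper. A minimal sketch, assuming an in-memory rolling window; a production deployment would persist counts per user or API key:

```python
import time
from collections import deque

WEEK_SECONDS = 7 * 24 * 3600
THINKING_WEEKLY_CAP = 3000  # the per-week cap reported in this article


class ThinkingBudget:
    """Track 'Thinking'-mode requests against a rolling one-week window."""

    def __init__(self, cap=THINKING_WEEKLY_CAP):
        self.cap = cap
        self._calls = deque()  # timestamps of recorded requests

    def record(self):
        now = time.time()
        self._calls.append(now)
        # Drop timestamps that have aged out of the one-week window.
        while self._calls and now - self._calls[0] > WEEK_SECONDS:
            self._calls.popleft()

    def remaining(self):
        return max(self.cap - len(self._calls), 0)


# Usage: time each request and record it against the budget.
budget = ThinkingBudget()
start = time.perf_counter()
# ... issue a 'Thinking'-mode request here ...
latency = time.perf_counter() - start
budget.record()
print(f"latency={latency:.2f}s, thinking requests left this week={budget.remaining()}")
```

Logging these two numbers alongside normal telemetry makes it obvious when workflows are drifting toward the weekly cap or when peak-hour latency starts to affect triage times.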
Conclusion
OpenAI’s adjustments to GPT-5’s usage limits reflect the challenges of balancing performance, cost, and user expectations in advanced AI systems. The 3,000-per-week cap on “Thinking” mode represents a compromise between resource constraints and professional user needs. As organizations evaluate GPT-5 for security applications, they should consider both its technical capabilities and the operational constraints revealed during this rollout.
The model’s improved accuracy and expanded context windows offer potential benefits for threat analysis and code review, while the tiered access structure allows organizations to match capability levels with security requirements. Future developments will likely focus on optimizing resource usage while maintaining the quality gains demonstrated in GPT-5’s benchmark results.
References
1. “ChatGPT users are not happy with GPT-5 launch,” TechRadar, Aug. 2025.
2. “Introducing GPT-5,” OpenAI, Aug. 2025.
3. “GPT-5: Comprehensive Overview,” DataCamp, Aug. 2025.
4. “OpenAI is editing its GPT-5 rollout on the fly,” VentureBeat, Aug. 2025.
5. “GPT-5 is very slow compared to 4.1,” OpenAI Community, Aug. 2025.
6. @nicdunz Twitter thread on GPT-5 updates, Aug. 2025.
7. “Introducing GPT-5 for Developers,” OpenAI API Docs, Aug. 2025.
8. DataCamp hands-on test images and results, Aug. 2025.
9. “AI Chatbots and User Attachments,” New York Times, Aug. 2025.