
OpenAI confirmed a widespread outage affecting ChatGPT users globally on June 10, 2025, with services disrupted for approximately nine hours. The incident, resolved by 6:30 PM ET, stemmed from server overload due to misconfigured rate-limiting during peak traffic, exacerbated by CDN failures and local storage corruption [1]. This analysis breaks down the technical causes, regional impacts, and mitigation strategies relevant to infrastructure security.
Service Impact and Root Cause
The outage rendered ChatGPT inaccessible for free-tier users, while Plus/Enterprise subscribers saw roughly 40% higher latency. Forensic reports identified corrupted `widget-session` tokens in HTML Local Storage and conflicts with persistent GDPR cookies as the primary failure points [2]. The Sora video generation service also experienced GPU allocation bottlenecks, triggering render timeout errors. Regional data highlights disproportionate effects in India (88% login failures) and the EU (75% cookie expiry conflicts), as shown below:
| Region | User Complaints | Primary Issues |
|---|---|---|
| India | 88% | Login loops from corrupted session tokens |
| EU | 75% | GDPR cookie persistence conflicts |
Technical Breakdown
Local storage corruption affected 200+ tokens, including critical session identifiers such as `REACT_QUERY_OFFLINE_CACHE`. Third-party cookies such as `GDPR_Settings` (with a 10-year expiry) clashed with session management systems, blocking API calls [1]. The following JavaScript hotfix was deployed to resolve the cookie conflicts:
```javascript
// Expire the conflicting GDPR_Settings cookie immediately by setting its
// expiry date in the past, forcing the browser to delete it.
document.cookie = "GDPR_Settings=; expires=Thu, 01 Jan 1970 00:00:00 UTC; path=/;";
```
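Setting a cookie's `expires` attribute to a date in the past instructs the browser to delete it on the next write, which clears the stale consent value without touching other cookies scoped to the same path.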
Mitigation steps included rolling back rate-limiting changes at 10:30 AM ET and deploying backup Cloudflare CDN routes by 12:15 PM ET. Full service restoration required patching local storage handlers and disabling conflicting cookies via the above script [2].
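As an illustration of what "patching local storage handlers" can look like in practice, the sketch below wraps reads of session entries in a JSON parse guard and drops anything unreadable. The key list is a hypothetical stand-in; only `REACT_QUERY_OFFLINE_CACHE` appears in the reports above.

```javascript
// Minimal sketch of a defensive local storage read, assuming session state is
// stored as JSON strings. Only REACT_QUERY_OFFLINE_CACHE is named in the
// outage reports; "widget-session" is an assumed key for illustration.
const SESSION_KEYS = ["REACT_QUERY_OFFLINE_CACHE", "widget-session"];

function readSessionEntry(key) {
  const raw = window.localStorage.getItem(key);
  if (raw === null) return null; // nothing stored under this key
  try {
    return JSON.parse(raw); // well-formed entry
  } catch (err) {
    // Corrupted entry: remove it so the client falls back to a fresh login
    // instead of looping on an unreadable token.
    window.localStorage.removeItem(key);
    return null;
  }
}

// Purge any corrupted entries up front, e.g. during app bootstrap.
SESSION_KEYS.forEach(readSessionEntry);
```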
Security Implications
The incident underscores risks in session persistence mechanisms and third-party cookie dependencies. Organizations should audit token storage implementations and validate CDN failover procedures. For teams managing similar infrastructures, consider these steps:
- Implement token checksum validation for local storage entries (a sketch follows this list)
- Test rate-limiting configurations under simulated peak loads
- Monitor cookie expiration policies for compliance conflicts
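A minimal sketch of the first recommendation, token checksum validation: each value is stored alongside a cheap FNV-1a checksum and discarded on read if the two no longer match. The key name and the `::` separator are illustrative assumptions, not drawn from OpenAI's implementation.

```javascript
// Cheap FNV-1a 32-bit hash; an integrity check, not a security measure.
function checksum(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

function storeToken(key, token) {
  // Persist the token together with its checksum.
  window.localStorage.setItem(key, `${token}::${checksum(token)}`);
}

function loadToken(key) {
  const raw = window.localStorage.getItem(key);
  if (raw === null) return null;
  const sep = raw.lastIndexOf("::");
  if (sep === -1) return null; // legacy or malformed entry
  const token = raw.slice(0, sep);
  const stored = raw.slice(sep + 2);
  if (checksum(token) !== stored) {
    window.localStorage.removeItem(key); // corrupted: discard rather than reuse
    return null;
  }
  return token;
}

// Usage (hypothetical key name):
// storeToken("session-token", freshToken);
// const token = loadToken("session-token");
```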
Economic losses were estimated at $2.1M in lost API revenue, highlighting the operational impact of such outages [1]. Historical context reveals this as part of a pattern, following an AWS S3 failure in December 2024 that cascaded to inference clusters.
OpenAI’s post-mortem acknowledged gaps in alerting for local storage failures, a critique echoed by developers relying on ChatGPT’s API [2]. Future improvements may include real-time monitoring for token corruption and granular outage notifications.
References
1. “ChatGPT Down for Three Hours: Here’s What We Know.” Decrypt, 10 June 2025.
2. “OpenAI Post-Mortem Report.” The Verge, 10 June 2025.