gekro
AI News

MiniMax M3 open-weight model claims SWE-Bench Pro lead with 1M-token context

MiniMax released M3 on June 1, positioning it as the company’s first open-weight model competitive with frontier closed-source systems (MiniMax Blog). The model combines a 1-million-token context window, native text, image, and video input, and agentic coding capabilities. It is built on the MiniMax Sparse Attention (MSA) architecture, which the company says delivers more than 9x faster prefill and more than 15x faster decoding at the 1M-token limit by cutting per-token compute to approximately one-twentieth of the prior generation’s (The Decoder). MiniMax reports 59.0% on SWE-Bench Pro, a score it says exceeds those of GPT-5.5 and Gemini 3.1 Pro (MiniMax Blog). Independent verification is not yet available, as several runs were conducted on MiniMax’s own infrastructure using proprietary agent scaffolding (TechTimes). The API is live now via MiniMax’s platform, with weights and a technical report due on Hugging Face within approximately ten days of launch (The Decoder).

The Trump administration signed an executive order on June 2 creating a voluntary pre-release review process for frontier AI models, under which developers may submit models for government evaluation up to 30 days before public release (NPR). The order directs agencies to build an AI cybersecurity clearinghouse, a shared vulnerability database coordinated with industry and critical infrastructure operators. It also tasks the NSA with setting the benchmark thresholds that determine which models qualify as covered frontier models subject to review. The order expressly prohibits any mandatory licensing or preclearance requirement for model distribution (White House, The Register). Separately, OpenAI on May 29 launched Rosalind Biodefense, a sponsored-access program giving vetted developers and select government partners use of its GPT-Rosalind reasoning model for life-sciences workloads including epidemiological modeling, outbreak detection, and diagnostic development (OpenAI, Axios).