Weibo releases open-source VibeThinker 1.5B on November 11, 2025, post-trained for just $7,800 and published under an MIT license. It outperforms DeepSeek R1 on the AIME24 math benchmark using Spectrum-to-Signal training.
Key Highlights
- Weibo AI released VibeThinker 1.5B on November 11, 2025, a 1.5 billion parameter open-source reasoning model
- Post-training cost only $7,800, which is 30 to 60 times cheaper than DeepSeek R1 ($294K) and MiniMax-M1 ($535K)
- Achieves 80.3 on AIME24 math benchmark, surpassing DeepSeek R1’s 79.8 despite having 400 times fewer parameters
- Available free under MIT license on Hugging Face, GitHub, and ModelScope for commercial use
- Uses Spectrum-to-Signal Principle (SSP) training framework with diversity-driven optimization
- Scores 74.4 on AIME25 and 50.4 on HMMT25, outperforming models hundreds of times larger
- Built on Alibaba’s Qwen2.5-Math-1.5B base with innovative two-stage post-training methodology
- Achieves 55.9 on LiveCodeBench v5 and 51.1 on v6, beating Magistral Medium’s 50.3
What Makes VibeThinker 1.5B Different from Large AI Models
VibeThinker 1.5B challenges the AI industry’s core assumption that bigger models always deliver better reasoning. While competitors like DeepSeek R1 use 671 billion parameters and Kimi K2 exceeds 1 trillion parameters, this compact model achieves comparable or superior performance with just 1.5 billion parameters. The efficiency gap is staggering: VibeThinker’s architecture is 100 to 600 times smaller than these giants, yet it matches them on competitive mathematics and coding tasks.
The model demonstrates that parameter count alone doesn’t determine reasoning capability. Traditional scaling approaches throw more computational power at problems, assuming size equals intelligence. VibeThinker proves that smarter training methods can extract remarkable performance from modest architectures. This efficiency matters beyond academic benchmarks because smaller models deploy on edge devices like smartphones and reduce inference costs by 20 to 70 times compared to massive models.
Weibo’s approach makes advanced AI accessible to researchers and developers without access to supercomputer-scale resources. The $7,800 post-training budget represents a fundamental shift in how the industry thinks about model development. Instead of racing toward trillion-parameter systems that only tech giants can afford, VibeThinker shows that algorithmic innovation can democratize cutting-edge AI capabilities.
How Does the Spectrum-to-Signal Training Method Work
The Spectrum-to-Signal Principle (SSP) framework divides post-training into two distinct phases with different optimization goals. During supervised fine-tuning (SFT), the model maximizes solution diversity across potential correct answers rather than optimizing purely for single-answer correctness. This “spectrum phase” encourages the model to explore multiple valid reasoning paths, improving its Pass@K score, where K is the number of solution attempts allowed per problem.
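As a concrete reference for the metric the spectrum phase targets, here is a minimal sketch of the standard unbiased Pass@K estimator used by code- and math-generation benchmarks; the sample counts in the example are hypothetical, not VibeThinker results.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@K estimate given n sampled solutions, c of which are correct.

    Uses 1 - C(n - c, k) / C(n, k): the probability that at least one of k
    randomly chosen samples (out of the n drawn) is correct.
    """
    if n - c < k:
        return 1.0  # not enough incorrect samples to fill k draws, so success is guaranteed
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical example: 16 sampled solutions per problem, 3 of them correct.
print(f"Pass@1 ~ {pass_at_k(16, 3, 1):.3f}")   # 0.188
print(f"Pass@8 ~ {pass_at_k(16, 3, 8):.3f}")   # 0.900
```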
After establishing diverse reasoning pathways, the reinforcement learning stage shifts focus to signal optimization. The MaxEnt-Guided Policy Optimization (MGPO) framework reinforces correct solutions while maintaining the exploratory diversity developed during SFT. This two-stage approach systematically integrates diversity as the central design principle, enabling small models to achieve robust performance that surpasses conventional training paradigms.
Traditional training methods often collapse toward single solution pathways, limiting a model’s reasoning flexibility. SSP prevents this collapse by explicitly rewarding diverse correct approaches during early training, then sharpening those pathways through targeted reinforcement. The result is a model that can tackle problems from multiple angles, adapting its strategy based on question complexity rather than following rigid solution templates.
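Weibo’s technical report defines the exact MGPO objective; the sketch below only illustrates the general idea of entropy-guided reweighting, under the assumption that problems where the model’s measured pass rate sits near 0.5 (maximum uncertainty) should carry the strongest reinforcement signal. The function names and the normalization are illustrative, not the published algorithm.

```python
import math

def bernoulli_entropy(p: float) -> float:
    """Entropy (in nats) of a Bernoulli variable with success probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def problem_weight(pass_rate: float) -> float:
    """Illustrative weight: normalized so a pass rate of 0.5 gets weight 1.0.

    Problems the model always solves (or never solves) contribute little
    learning signal; problems it solves about half the time contribute most.
    """
    return bernoulli_entropy(pass_rate) / math.log(2.0)

# Hypothetical pass rates measured by sampling several rollouts per problem.
for rate in (0.0, 0.25, 0.5, 0.9, 1.0):
    print(f"pass rate {rate:.2f} -> weight {problem_weight(rate):.2f}")
```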
Why Does VibeThinker Cost Only $7,800 to Train
The remarkably low training cost stems from VibeThinker’s compact 1.5 billion parameter architecture combined with efficient training techniques. The entire post-training process consumed approximately 3,900 GPU hours on NVIDIA H800 processors. By comparison, DeepSeek R1 required $294,000 in post-training expenses, while MiniMax-M1 cost $535,000. VibeThinker achieves comparable performance at 1/30 to 1/60 of these figures.
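For a quick sanity check, the arithmetic below derives the GPU rental rate implied by the two published figures; the roughly $2-per-H800-hour result is an inference from those totals, not a rate Weibo has stated.

```python
# Back-of-the-envelope check on the reported post-training budget.
gpu_hours = 3_900        # reported NVIDIA H800 GPU-hours for post-training
total_cost_usd = 7_800   # reported post-training budget

# The per-hour rate is inferred from the two published totals,
# not a figure Weibo has disclosed directly.
implied_rate = total_cost_usd / gpu_hours
print(f"Implied H800 rental rate: ${implied_rate:.2f} per GPU-hour")
```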
The cost efficiency comes from strategic choices at every development stage. Starting with Alibaba’s Qwen2.5-Math-1.5B as a base saved pre-training resources while providing a strong mathematical foundation. The SSP framework maximizes learning from each training example through diversity exploration, reducing the total data and compute requirements. Smaller parameter counts also mean faster iteration cycles, allowing researchers to test and refine approaches without massive infrastructure investments.
This breakthrough fundamentally changes the economics of developing high-performance reasoning models. Research teams at universities, startups, and smaller companies can now experiment with cutting-edge AI techniques without multi-million-dollar budgets. The low barrier to entry encourages more diverse perspectives and innovations in the field, potentially accelerating progress beyond what centralized research at tech giants can achieve alone.
How Well Does VibeThinker Perform on Math and Coding Benchmarks
VibeThinker 1.5B delivers standout performance across competitive mathematics and programming challenges. On AIME24, the prestigious American Invitational Mathematics Examination, it scores 80.3 compared to DeepSeek R1’s 79.8. For context, DeepSeek R1 has 671 billion parameters, making it 447 times larger. VibeThinker also achieves 74.4 on AIME25 and 50.4 on HMMT25, substantially outperforming its size peers and matching much larger commercial models.
In coding tasks, the model scores 55.9 on LiveCodeBench v5 and 51.1 on v6. The v6 score edges past Magistral Medium’s 50.3, demonstrating strong practical programming abilities. Notably, the base Qwen2.5-Math-1.5B model scored 0.0 on both LiveCodeBench versions before Weibo’s post-training, highlighting how SSP unlocks capabilities that raw architecture alone cannot provide.
Compared to other small models, VibeThinker dominates its weight class. It more than doubles SmolLM 3B’s AIME25 score (74.4 vs 36.7) and maintains similar margins on HMMT25 (50.4 vs 26.0) and LiveCodeBench v5 (55.9 vs 27.6). Against Qwen3-1.7B, VibeThinker shows substantial advantages on AIME25 (74.4 vs 36.8) and LiveCodeBench v6 (51.1 vs 26.9), establishing itself as the most capable reasoning model under 3 billion parameters.
What Are the Practical Applications for Developers
VibeThinker’s compact size and strong reasoning enable deployment scenarios impossible for massive models. Developers can run the model on consumer hardware with 8GB or more VRAM, bringing advanced AI capabilities to edge devices, mobile applications, and resource-constrained environments. This opens possibilities for offline AI assistants, embedded systems in vehicles, and educational tools that work without constant cloud connectivity.
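As a rough check on the 8GB figure, the sketch below estimates inference memory for a 1.5-billion-parameter model in half precision plus a long-context KV cache; the attention geometry is an assumption modeled on typical Qwen2.5-1.5B-class configurations, not published VibeThinker internals.

```python
# Rough memory estimate for running a 1.5B-parameter model in fp16/bf16.
# The attention geometry below is an assumption based on typical
# Qwen2.5-1.5B-class configurations, not confirmed VibeThinker internals.
params = 1.5e9
bytes_per_param = 2                      # fp16 / bf16 weights
weight_gb = params * bytes_per_param / 1e9

layers, kv_heads, head_dim = 28, 2, 128  # assumed grouped-query attention layout
context_tokens = 40_960                  # recommended max generation length
# Per token: K and V tensors for every layer, each kv_heads * head_dim wide.
kv_bytes_per_token = 2 * layers * kv_heads * head_dim * bytes_per_param
kv_gb = kv_bytes_per_token * context_tokens / 1e9

print(f"Weights:  ~{weight_gb:.1f} GB")
print(f"KV cache: ~{kv_gb:.1f} GB at {context_tokens} tokens")
print(f"Total:    ~{weight_gb + kv_gb:.1f} GB, comfortably under 8 GB of VRAM")
```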
The MIT license removes commercial barriers, allowing startups and enterprises to integrate VibeThinker into products without licensing fees or usage restrictions. Companies building tutoring apps, coding assistants, or problem-solving tools can leverage production-ready reasoning at minimal infrastructure cost. The 20 to 70 times lower inference costs compared to large models make continuous operation economically viable for applications serving millions of users.
Researchers gain a transparent platform for studying reasoning mechanisms in language models. The complete release includes model weights, training code, evaluation scripts, and CUDA kernels. This openness enables academic teams to validate claims, experiment with variations of the SSP framework, and contribute improvements back to the community. The low training cost also makes it feasible for university labs to fine-tune versions for specialized domains without requiring industry-scale compute budgets.
How Can You Start Using VibeThinker 1.5B Today
Accessing VibeThinker requires just a few straightforward steps. Visit the WeiboAI GitHub repository or find the model on Hugging Face and ModelScope platforms. The repository provides detailed setup instructions, including environment configuration and dependency installation. Developers can choose between using the pre-trained model for immediate inference or fine-tuning it on custom datasets using provided guides.
For quick testing, load the model using the Transformers library with the recommended parameters: temperature 0.6 or 1.0, max token length 40,960, top_p 0.95, and top_k -1. These settings balance creativity with reliability for competitive mathematics and coding problems. The model supports standard chat interfaces, making integration into existing applications straightforward. Sample code demonstrates initialization, prompt formatting, and response handling.
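A minimal inference sketch with the Transformers library is shown below. The model identifier and chat-template call are assumptions based on the Hugging Face listing, so check the WeiboAI repository’s official sample code before relying on them; note that top_k -1 is the vLLM convention for disabling top-k sampling, and the Transformers equivalent is top_k=0.

```python
# Minimal inference sketch with Hugging Face Transformers.
# The model ID below is assumed from the Hugging Face listing; consult the
# WeiboAI repository for the exact identifier and official sample code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "WeiboAI/VibeThinker-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "How many positive divisors does 360 have?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Recommended sampling settings: temperature 0.6 (or 1.0), top_p 0.95,
# up to 40,960 generated tokens. top_k=0 disables top-k in Transformers.
outputs = model.generate(
    inputs,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
    top_k=0,
    max_new_tokens=40_960,
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For higher-throughput serving, the same sampling settings carry over to a dedicated inference engine such as vLLM, where the published top_k value of -1 can be used as-is.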
Evaluation tools help verify performance claims independently. The GitHub repository includes math and code evaluation programs with benchmark datasets like AIME, HMMT, and LiveCodeBench. Researchers can reproduce reported scores or test the model on custom problem sets. Community forums and issue trackers provide support channels where developers share experiences, optimizations, and novel applications as the ecosystem grows around this efficient reasoning model.