model June 18, 2026

VibeThinker-3B: Small‑Scale Reasoning Powerhouse for Math, Code, and STEM

VibeThinker-3B, released by WeiboAI, is a 3‑billion‑parameter language model built on top of Qwen/Qwen2.5-Coder-3B. It targets verifiable reasoning tasks such as mathematics, competitive programming, and STEM‑focused problem solving. The model follows the Spectrum-to-Signal Principle (SSP) training pipeline, combining curriculum‑based supervised fine‑tuning, multi‑domain reasoning reinforcement learning, offline self‑distillation, and instruct‑RL to produce strong multi‑step reasoning and answer‑verification capabilities. The README emphasizes that the model was not trained on tool‑calling or agent‑based programming data, so it is best suited for tasks where the correct answer can be independently verified.

Benchmark results highlighted in the technical report show VibeThinker-3B achieving 76.4 % accuracy on IMO‑AnswerBench (rising to 80.6 % with Claim‑Level Reliability Assessment) and a 96.1 % acceptance rate on recent LeetCode contests, placing it in the performance range of much larger frontier models. Its strengths lie in compact, parameter‑dense reasoning rather than broad open‑domain knowledge, making it an attractive option for developers who need high‑quality math or code generation without the overhead of multi‑hundred‑billion‑parameter models.

The model is distributed as a Transformers‑compatible checkpoint with safetensors, and inference can be accelerated with vLLM or SGLang. Usage guidelines recommend a temperature of 1.0, top‑p of 0.95, and a 64K context window to preserve long‑horizon reasoning trajectories. Licensed under MIT, VibeThinker-3B is positioned as a research tool for exploring the limits of small language models on structured, verifiable tasks.

Project Ideas

  1. Create a competitive‑programming assistant that takes LeetCode prompts and returns step‑by‑step Python solutions, leveraging the model's strong performance on contest problems.
  2. Build an interactive math tutoring chatbot that explains solutions to Olympiad‑level problems, using the model's verified reasoning ability to generate and check each step.
  3. Integrate VibeThinker-3B into a code‑generation IDE plugin for generating algorithmic snippets that can be automatically tested against sample inputs.
  4. Develop a benchmark suite for new STEM reasoning tasks and evaluate VibeThinker-3B’s performance compared to larger models using the recommended vLLM settings.
  5. Design a verification‑first pipeline where the model produces candidate answers for scientific calculations and then self‑checks them using its claim‑level reliability assessment.
← Back to all reports