Microsoft’s new rStar-Math technique upgrades small models to outperform OpenAI’s o1-preview at math problems

January 9, 2025

Microsoft is doubling down on the potential of small language models (SLMs) with the unveiling of rStar-Math, a new reasoning technique that can be applied to small models to boost their performance on math problems, matching and in some cases exceeding the performance of OpenAI's o1-preview model.

The technique is still in a research phase, outlined in a paper published on the preprint site arXiv.org and credited to eight authors at Microsoft, Peking University, and Tsinghua University in China. The researchers applied it to several smaller open-source models, including Microsoft's own Phi-3 mini, Alibaba's Qwen-1.5B (a 1.5-billion-parameter model), and Qwen-7B (a 7-billion-parameter model), and saw improved performance on all of them. The boosted models even exceeded OpenAI's previously most advanced model on the third-party MATH benchmark, a set of 12,500 word problems covering branches such as geometry and algebra at all levels of difficulty.

Ultimately, according to a post on Hugging Face, the researchers plan to make their code and data available on GitHub at https://github.com/microsoft/rStar, though one of the paper's authors, Li Lyna Zhang, wrote in the comments on the Hugging Face post that the team is "still undergoing the internal review process for open-source release." As such, "the repository remains private for now. Please stay tuned!"

Community members expressed enthusiasm, calling the innovations “impressive” and praising the blend of Monte Carlo Tree Search (MCTS) with step-by-step reasoning. One commenter highlighted the simplicity and utility of using Q-values for step scoring, while others speculated on future applications in geometric proofs and symbolic reasoning.

This news follows closely on the heels of the open-sourcing of Microsoft’s Phi-4 model, a smaller 14-billion-parameter AI system now available on Hugging Face under the permissive MIT license.

While the Phi-4 release has expanded access to high-performance small models, rStar-Math showcases a specialized approach: using smaller AI systems to achieve state-of-the-art results in mathematical reasoning.

rStar-Math works by using several different models and components to help a target small model ‘self-evolve’

The key to rStar-Math is that it leverages Monte Carlo Tree Search (MCTS), a method that mimics human “deep thinking” by iteratively refining step-by-step solutions to mathematical problems.

The researchers used MCTS because it “breaks down complex math problems into simpler single-step generation tasks, reducing the difficulty” for smaller models.
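
As a rough illustration of the core idea (a generic sketch, not the paper's implementation), an MCTS over reasoning steps can be written in a few dozen lines of Python. The functions propose_steps and rollout_value below are hypothetical stand-ins for the step generator and the answer check; each tree node keeps a visit count N and a total value W, and the Q-value W/N is the kind of per-step score the paper builds on:

import math
import random

def propose_steps(partial_solution):
    # Hypothetical stand-in for a model proposing candidate
    # next reasoning steps; here we fake two random options.
    return [partial_solution + [f"step-{random.randint(0, 9)}"] for _ in range(2)]

def rollout_value(solution):
    # Stand-in for checking whether a completed solution is correct.
    return random.random()

def mcts_search(n_iterations=100, max_depth=4):
    stats = {(): [1, 0.0]}   # node (tuple of steps) -> [visits N, total value W]
    children = {}            # node -> its expanded child nodes

    def ucb(parent, child, c=1.4):
        n_child, w_child = stats[child]
        if n_child == 0:
            return float("inf")  # always try an unexplored step once
        # Q-value (W/N) plus an exploration bonus.
        return w_child / n_child + c * math.sqrt(math.log(stats[parent][0]) / n_child)

    for _ in range(n_iterations):
        node, path = (), [()]
        # Selection/expansion: walk down, picking the best next step by UCB.
        while len(node) < max_depth:
            if node not in children:
                children[node] = [tuple(s) for s in propose_steps(list(node))]
                for child in children[node]:
                    stats.setdefault(child, [0, 0.0])
            parent = node
            node = max(children[parent], key=lambda ch: ucb(parent, ch))
            path.append(node)
        # Simulation: estimate the value of the completed solution.
        value = rollout_value(list(node))
        # Backpropagation: update Q-values along the visited path.
        for key in path:
            stats[key][0] += 1
            stats[key][1] += value
    return stats

stats = mcts_search()
# The first step with the highest Q-value is the most promising opener.
first_steps = [k for k in stats if len(k) == 1 and stats[k][0] > 0]
print(max(first_steps, key=lambda k: stats[k][1] / stats[k][0]))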

However, they didn’t just apply MCTS as other researchers have in the past. Instead, in a stroke of brilliance, they also asked the model they trained to always output its “chain-of-thought” reasoning steps as both natural-language descriptions and Python code.

They mandated that the model include the natural-language responses as Python code comments, and only those outputs containing Python code would be used to train the model.
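
For illustration only (a made-up example, not from the paper), a training sample in this format might look like the snippet below: the natural-language rationale lives in the comments, and running the code verifies that the reasoning actually checks out before the sample is kept:

# Problem: Alice has 3 boxes of 12 apples and gives 7 away.
# How many apples does she have left?

# Step 1: Alice starts with 3 boxes of 12 apples each.
total_apples = 3 * 12  # 36 apples in total

# Step 2: She gives away 7 apples.
remaining = total_apples - 7

# Step 3: Therefore Alice has 29 apples left.
assert remaining == 29  # executing the code confirms the final answer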

The researchers also trained a “policy model” to generate math reasoning steps and a Process Preference Model (PPM) to select the most promising steps toward solving each problem, then improved both over four rounds of “self-evolution,” with each model refining the other.
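
The full training loop is more involved, but the division of labor between the two models can be sketched roughly as follows, with policy_model and ppm as hypothetical stand-ins for what would be trained neural networks in practice:

import random

def policy_model(problem, partial_steps):
    # Stand-in: propose candidate next reasoning steps.
    return [f"candidate step {i}" for i in range(4)]

def ppm(problem, partial_steps, candidate):
    # Stand-in: score how promising a candidate step is.
    return random.random()

def solve(problem, n_steps=3):
    steps = []
    for _ in range(n_steps):
        candidates = policy_model(problem, steps)
        # The PPM picks the most promising next step, steering
        # generation toward correct solutions.
        steps.append(max(candidates, key=lambda c: ppm(problem, steps, c)))
    return steps

# Trajectories produced this way and filtered for correctness become
# training data for the next round, which is how the policy model and
# PPM improve each other across the four rounds of "self-evolution."
print(solve("What is 3 * 12 - 7?"))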

For their starting data, the researchers said they used “747,000 math word problems from publicly available sources,” along with their solutions, but generated new steps for solving them with the two models described above.

Record-Breaking Results

After four rounds of self-evolution, rStar-Math achieved significant milestones:

• On the MATH benchmark, the accuracy of the Qwen2.5-Math-7B model jumped from 58.8% to 90.0%, outperforming OpenAI o1-preview.

• On the American Invitational Mathematics Examination (AIME), it solved 53.3% of problems, placing among the top 20% of high school competitors.

These results highlight the power of SLMs in handling complex mathematical reasoning, traditionally dominated by larger systems.

Smaller is better?

In recent years, AI innovation has largely been driven by scaling up language models, with increasing parameters seen as a way to improve performance. Yet, the high costs associated with these massive models, from computational resources to energy consumption, have raised questions about scalability.

Microsoft is offering an alternative path, focusing on efficiency. The release of rStar-Math further underscores this commitment by demonstrating how SLMs can rival—and in some cases exceed—the capabilities of their larger counterparts.

Microsoft’s dual releases of Phi-4 and the rStar-Math paper suggest that compact, specialized models can provide powerful alternatives to the industry’s largest systems.

Moreover, by outperforming larger competitors in key benchmarks, these models challenge the notion that bigger is always better. They open doors for mid-sized organizations and academic researchers to access cutting-edge capabilities without the financial or environmental burden of massive models.

The post Microsoft’s new rStar-Math technique upgrades small models to outperform OpenAI’s o1-preview at math problems appeared first on VentureBeat.
