MrLight's picture
Update README.md
e05e2d4 verified
---
license: apache-2.0
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
base_model:
- Qwen/Qwen2.5-7B
tags:
- General-Reasoner-7B
---
# General-Reasoner: Advancing LLM Reasoning Across All Domains
<p align="center">
<a href="https://github.com/TIGER-AI-Lab/General-Reasoner" target="_blank">💻 Code</a> |
<a href="https://arxiv.org/abs/2505.14652" target="_blank">📄 Paper</a> |
<a href="https://huggingface.co/datasets/TIGER-Lab/WebInstruct-verified" target="_blank">📊 Dataset</a> |
<a href="https://huggingface.co/collections/TIGER-Lab/general-reasoner-67fe9386e43e046489eac013" target="_blank">🤗 Model</a> |
<a href="https://tiger-ai-lab.github.io/General-Reasoner/" target="_blank">🌐 Project Page</a>
</p>
## Overview
<p align="center">
<img src="https://tiger-ai-lab.github.io/General-Reasoner/static/images/teaser.png" alt="General-Reasoner Teaser" width="650"/>
</p>
<p align="center" style="font-style: italic; font-size: 0.95rem;">
<em>
Figure: Effectiveness of <strong>General-Reasoner</strong> trained with diverse verifiable reasoning questions using model-based verifier compared to baseline methods on various reasoning tasks.
</em>
</p>
**General-Reasoner** is a training paradigm for large language models (LLMs), designed to robustly enhance reasoning abilities across diverse domains—not just mathematics and coding, but also physics, chemistry, finance, humanities, and more.
**Key features:**
- **Zero RL Training:** Direct reinforcement learning from base LLMs, bypassing intermediate supervised stages.
- **Diverse Reasoning Data:** 230K+ high-quality, verifiable questions sourced from the web and filtered for answer verifiability across disciplines.
- **Model-Based Verifier:** Compact 1.5B generative verifier model for context-aware, chain-of-thought answer validation, outperforming traditional rule-based methods.
**This specific model is the General-Reasoner variant trained based on [Qwen2.5-7B-Base](https://huggingface.co/Qwen/Qwen2.5-7B).**
## Main Results
General-Reasoner outperforms base and supervised models on a variety of reasoning benchmarks, demonstrating robust generalization across domains:
<p align="center">
<a href="https://github.com/TIGER-AI-Lab/General-Reasoner/raw/refs/heads/gh-pages/static/images/results_general.png" target="_blank">
<img src="https://github.com/TIGER-AI-Lab/General-Reasoner/raw/refs/heads/gh-pages/static/images/results_general.png" alt="Main Results" width="600">
</a>
</p>
## Citation
If you feel our work is helpful, please cite:
```bibtex
@article{general-reasoner,
title={{G}eneral-{R}easoner: Advancing LLM Reasoning Across All Domains},
author={Xueguang Ma and Qian Liu and Dongfu Jiang and Ge Zhang and Zejun Ma and Wenhu Chen},
year={2025},
journal={arXiv:2505.14652},
url={https://arxiv.org/abs/2505.14652}
}
```