Collections
Discover the best community collections!
Collections including paper arxiv:2502.06703

- Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
  Paper • 2412.18319 • Published • 39
- Token-Budget-Aware LLM Reasoning
  Paper • 2412.18547 • Published • 46
- Efficiently Serving LLM Reasoning Programs with Certaindex
  Paper • 2412.20993 • Published • 37
- B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
  Paper • 2412.17256 • Published • 47

- Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
  Paper • 2408.03314 • Published • 63
- Training Compute-Optimal Large Language Models
  Paper • 2203.15556 • Published • 11
- Scaling Laws for Precision
  Paper • 2411.04330 • Published • 8
- Transcending Scaling Laws with 0.1% Extra Compute
  Paper • 2210.11399 • Published

- Large Language Diffusion Models
  Paper • 2502.09992 • Published • 123
- The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding
  Paper • 2502.08946 • Published • 191
- BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models
  Paper • 2502.07346 • Published • 54
- Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
  Paper • 2502.06703 • Published • 152

- InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU
  Paper • 2502.08910 • Published • 148
- Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
  Paper • 2502.06703 • Published • 152
- The Curse of Depth in Large Language Models
  Paper • 2502.05795 • Published • 40

- Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model
  Paper • 2503.24290 • Published • 62
- I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders
  Paper • 2503.18878 • Published • 119
- START: Self-taught Reasoner with Tools
  Paper • 2503.04625 • Published • 113
- DAPO: An Open-Source LLM Reinforcement Learning System at Scale
  Paper • 2503.14476 • Published • 142

- Evolving Deeper LLM Thinking
  Paper • 2501.09891 • Published • 115
- PaSa: An LLM Agent for Comprehensive Academic Paper Search
  Paper • 2501.10120 • Published • 53
- Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong
  Paper • 2501.09775 • Published • 33
- ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario
  Paper • 2501.10132 • Published • 22

- CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference
  Paper • 2502.04416 • Published • 12
- Competitive Programming with Large Reasoning Models
  Paper • 2502.06807 • Published • 68
- Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
  Paper • 2502.06703 • Published • 152
- GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset
  Paper • 2507.21033 • Published • 20