Reasoning-Benchmarks
A collection of multiple benchmarks for evaluating large reasoning models.
guanning/amc23 • Updated May 25 • 40 • 15
guanning/math • Updated Jun 12 • 12.5k • 22
guanning/aime24 • Updated May 25 • 30 • 18
guanning/aime25 • Updated May 25 • 30 • 13