Eye, Robot: Learning to Look to Act with a BC-RL Perception-Action Loop Paper • 2506.10968 • Published Jun 12, 2025 • 1
GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics Paper • 2602.12617 • Published 19 days ago • 20
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs Paper • 2601.08763 • Published Jan 13 • 148
AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning Paper • 2505.24298 • Published May 30, 2025 • 29
Article 🐺🐦‍⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs Dec 4, 2024 • 80
Skywork-o1-Open Collection Skywork o1 open model collections • 3 items • Updated Jun 12, 2025 • 22
Llama-3.1-Nemotron-70B Collection SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated about 18 hours ago • 155