Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published 29 days ago • 105
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration (2511.21689)
RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments Paper • 2511.07317 • Published Nov 10, 2025 • 17
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published May 30, 2025 • 144
Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published Apr 29, 2025 • 98
EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees Paper • 2503.08893 • Published Mar 11, 2025 • 6
Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning Paper • 2310.06694 • Published Oct 10, 2023 • 3
Evaluating Large Language Models at Evaluating Instruction Following Paper • 2310.07641 • Published Oct 11, 2023 • 1
Plug-and-Play Knowledge Injection for Pre-trained Language Models Paper • 2305.17691 • Published May 28, 2023 • 1