DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation • Paper • 2511.06307 • Published 28 days ago • 50
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters • Paper • 2408.03314 • Published Aug 6, 2024 • 63
Interpretability & Analysis of LMs • Collection • Outstanding research in LM interpretability and evaluation, summarized • 134 items • Updated Oct 20 • 116