AgentIR: Reasoning-Aware Retrieval for Deep Research Agents • Paper • 2603.04384 • Published 6 days ago • 1
NOBLE: Accelerating Transformers with Nonlinear Low-Rank Branches • Paper • 2603.06492 • Published 4 days ago • 1
Progressive Residual Warmup for Language Model Pretraining • Paper • 2603.05369 • Published 5 days ago • 30
T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning • Paper • 2603.03790 • Published 6 days ago • 112
Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory • Paper • 2603.04257 • Published 6 days ago • 18
If You Want Coherence, Orchestrate a Team of Rivals: Multi-Agent Models of Organizational Intelligence • Paper • 2601.14351 • Published Jan 20 • 1
F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare • Paper • 2602.06717 • Published Feb 6 • 72
POPE: Learning to Reason on Hard Problems via Privileged On-Policy Exploration • Paper • 2601.18779 • Published Jan 26 • 1
VERGE: Formal Refinement and Guidance Engine for Verifiable LLM Reasoning • Paper • 2601.20055 • Published Jan 27 • 7
Mechanistic Data Attribution: Tracing the Training Origins of Interpretable LLM Units • Paper • 2601.21996 • Published Jan 29 • 5
MAD: Modality-Adaptive Decoding for Mitigating Cross-Modal Hallucinations in Multimodal Large Language Models • Paper • 2601.21181 • Published Jan 29 • 9
Statistical Estimation of Adversarial Risk in Large Language Models under Best-of-N Sampling • Paper • 2601.22636 • Published Jan 30 • 22
THINKSAFE: Self-Generated Safety Alignment for Reasoning Models • Paper • 2601.23143 • Published Jan 30 • 39
SAGE: Steerable Agentic Data Generation for Deep Search with Execution Feedback • Paper • 2601.18202 • Published Jan 26 • 9
OmegaUse: Building a General-Purpose GUI Agent for Autonomous Task Execution • Paper • 2601.20380 • Published Jan 28 • 9
Spark: Strategic Policy-Aware Exploration via Dynamic Branching for Long-Horizon Agentic Learning • Paper • 2601.20209 • Published Jan 28 • 23