DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research Paper • 2511.19399 • Published 15 days ago • 54
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe Paper • 2511.16334 • Published 19 days ago • 91
OlmoEarth: Stable Latent Image Modeling for Multimodal Earth Observation Paper • 2511.13655 • Published 22 days ago • 9
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published 28 days ago • 194
PAN: A World Model for General, Interactable, and Long-Horizon World Simulation Paper • 2511.09057 • Published 28 days ago • 75
MolmoAct: Action Reasoning Models that can Reason in Space Paper • 2508.07917 • Published Aug 11 • 44
UI-Ins: Enhancing GUI Grounding with Multi-Perspective Instruction-as-Reasoning Paper • 2510.20286 • Published Oct 23 • 23
Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark Paper • 2510.26802 • Published Oct 30 • 33
Kimi Linear: An Expressive, Efficient Attention Architecture Paper • 2510.26692 • Published Oct 30 • 117
Paper2Video: Automatic Video Generation from Scientific Papers Paper • 2510.05096 • Published Oct 6 • 116
MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer Paper • 2509.16197 • Published Sep 19 • 56
BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining Paper • 2508.10975 • Published Aug 14 • 60
A Survey on Long-Video Storytelling Generation: Architectures, Consistency, and Cinematic Quality Paper • 2507.07202 • Published Jul 9 • 24
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1 • 242