MolmoB0T: Large-Scale Simulation Enables Zero-Shot Manipulation Paper • 2603.16861 • Published 6 days ago • 4
Recurrent-Depth VLA: Implicit Test-Time Compute Scaling of Vision-Language-Action Models via Latent Iterative Reasoning Paper • 2602.07845 • Published Feb 8 • 70
FailSafe: Reasoning and Recovery from Failures in Vision-Language-Action Models Paper • 2510.01642 • Published Oct 2, 2025 • 1
TOPReward: Token Probabilities as Hidden Zero-Shot Rewards for Robotics Paper • 2602.19313 • Published 29 days ago • 24
MolmoAct: Action Reasoning Models that can Reason in Space Paper • 2508.07917 • Published Aug 11, 2025 • 44
MolmoSpaces: A Large-Scale Open Ecosystem for Robot Navigation and Manipulation Paper • 2602.11337 • Published Feb 11 • 6
VLS: Steering Pretrained Robot Policies via Vision-Language Models Paper • 2602.03973 • Published Feb 3 • 22
SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation Paper • 2501.18564 • Published Jan 30, 2025 • 2
PointArena: Probing Multimodal Grounding Through Language-Guided Pointing Paper • 2505.09990 • Published May 15, 2025 • 12
Selective Visual Representations Improve Convergence and Generalization for Embodied AI Paper • 2311.04193 • Published Nov 7, 2023
RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics Paper • 2406.10721 • Published Jun 15, 2024 • 2
SAT: Dynamic Spatial Aptitude Training for Multimodal Language Models Paper • 2412.07755 • Published Dec 10, 2024 • 2
THE COLOSSEUM: A Benchmark for Evaluating Generalization for Robotic Manipulation Paper • 2402.08191 • Published Feb 13, 2024
From Mystery to Mastery: Failure Diagnosis for Improving Manipulation Policies Paper • 2412.02818 • Published Dec 3, 2024 • 1
AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation Paper • 2410.00371 • Published Oct 1, 2024