MA-EgoQA: Question Answering over Egocentric Videos from Multiple Embodied Agents Paper • 2603.09827 • Published 6 days ago • 26
MolHIT: Advancing Molecular-Graph Generation with Hierarchical Discrete Diffusion Models Paper • 2602.17602 • Published 25 days ago • 55
THINKSAFE: Self-Generated Safety Alignment for Reasoning Models Paper • 2601.23143 • Published Jan 30 • 39
Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models Paper • 2601.19834 • Published Jan 27 • 25
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 229
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation Paper • 2601.00664 • Published Jan 2 • 57
WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning Paper • 2512.02425 • Published Dec 2, 2025 • 25