AudioVisual-Caption/ASID-Captioner-3B Image-Text-to-Text • 5B • Updated 4 days ago • 2.67k • 3
Towards Universal Video MLLMs with Attribute-Structured and Quality-Verified Instructions Paper • 2602.13013 • Published 17 days ago • 8
MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization Paper • 2510.08540 • Published Oct 9, 2025 • 109
TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs Paper • 2509.18056 • Published Sep 22, 2025 • 27