arxiv:2512.16561
Boqiang Zhang
Cyril666
AI & ML interests
Multi-modal
Large Language Models
Vision-Language-Action Models
Recent Activity
authored a paper about 18 hours ago
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
authored a paper about 18 hours ago
What Is a Good Caption? A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness
authored a paper about 18 hours ago
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources