SSR • Collection • Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning • 6 items • Updated Jul 7, 2025 • 2
Sketch-in-Latents: Eliciting Unified Reasoning in MLLMs • Paper • 2512.16584 • Published 22 days ago • 1
R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning • Paper • 2508.21113 • Published Aug 28, 2025 • 110
VisionThink • Collection • Efficient Reasoning Vision Language Model • 7 items • Updated Jul 18, 2025 • 7
MiniCPM-o & MiniCPM-V • Collection • Multimodal models with leading performance. • 28 items • Updated Sep 1, 2025 • 59
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? • Paper • 2407.01284 • Published Jul 1, 2024 • 81
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking • Paper • 2502.02339 • Published Feb 4, 2025 • 23
Qwen2.5-VL • Collection • Vision-language model series based on Qwen2.5 • 11 items • Updated 9 days ago • 550
Qwen2-VL • Collection • Vision-language model series based on Qwen2 • 16 items • Updated 9 days ago • 227