LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling • Paper • 2511.20785 • Published 14 days ago • 150
EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing • Paper • 2509.26346 • Published Sep 30 • 18
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe • Paper • 2511.16334 • Published 19 days ago • 91
A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports • Paper • 2510.02190 • Published Oct 2 • 18
Modeling All-Atom Glycan Structures via Hierarchical Message Passing and Multi-Scale Pre-training • Paper • 2506.01376 • Published Jun 2
VideoScore2: Think before You Score in Generative Video Evaluation • Paper • 2509.22799 • Published Sep 26 • 25
PrismLayers: Open Data for High-Quality Multi-Layer Transparent Image Generative Models • Paper • 2505.22523 • Published May 28 • 7
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation • Paper • 2503.20672 • Published Mar 26 • 14