Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation Paper • 2503.19622 • Published Mar 25, 2025 • 31
Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis Paper • 2411.07132 • Published Nov 11, 2024
Pixels, Patterns, but No Poetry: To See The World like Humans Paper • 2507.16863 • Published Jul 21, 2025 • 69
Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think Paper • 2507.01467 • Published Jul 2, 2025
How Far Are LLMs from Professional Poker Players? Revisiting Game-Theoretic Reasoning with Agentic Tool Use Paper • 2602.00528 • Published 10 days ago
Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks Paper • 2602.01630 • Published 8 days ago • 47
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI Paper • 2512.16676 • Published Dec 18, 2025 • 217