Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation Paper • 2601.00664 • Published 5 days ago • 45
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length Paper • 2512.04677 • Published Dec 4, 2025 • 167
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 245
DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation Paper • 2511.19365 • Published Nov 24, 2025 • 64
ROOT: Robust Orthogonalized Optimizer for Neural Network Training Paper • 2511.20626 • Published Nov 25, 2025 • 43
SAM2S: Segment Anything in Surgical Videos via Semantic Long-term Tracking Paper • 2511.16618 • Published Nov 20, 2025 • 7
Depth Anything 3: Recovering the Visual Space from Any Views Paper • 2511.10647 • Published Nov 13, 2025 • 96
Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning Paper • 2510.23473 • Published Oct 27, 2025 • 84
BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping Paper • 2510.18927 • Published Oct 21, 2025 • 83
Durian: Dual Reference-guided Portrait Animation with Attribute Transfer Paper • 2509.04434 • Published Sep 4, 2025 • 10
FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait Paper • 2412.01064 • Published Dec 2, 2024 • 47
IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System 🎙 Generate speech from text using a reference audio • 216
TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis Paper • 2508.13618 • Published Aug 19, 2025 • 18
Gaze into the Heart: A Multi-View Video Dataset for rPPG and Health Biomarkers Estimation Paper • 2508.17924 • Published Aug 25, 2025 • 14
MIDAS: Multimodal Interactive Digital-human Synthesis via Real-time Autoregressive Video Generation Paper • 2508.19320 • Published Aug 26, 2025 • 29