SkyReels-V4: Multi-modal Video-Audio Generation, Inpainting and Editing model Paper • 2602.21818 • Published 8 days ago • 52
Qwen3.5 Collection Qwen3.5 is Qwen's new model family including Qwen3.5 Small: 0.8B, 2B, 4B, 9B and Qwen3.5 Medium: 35B-A3B, 27B, 122B-A10B and 397B-A17B. • 25 items • Updated 3 days ago • 84
BitDance Collection BitDance: Open-source autoregressive model with binary visual tokens. A research project for building powerful multimodal autoregressive model. • 10 items • Updated 3 days ago • 11
NEST-Ja Collection Japanese speech self-supervised learning model developed by SB Intuitions. • 2 items • Updated 23 days ago • 1
Atlas: Orchestrating Heterogeneous Models and Tools for Multi-Domain Complex Reasoning Paper • 2601.03872 • Published Jan 7 • 43
Yume-1.5: A Text-Controlled Interactive World Generation Model Paper • 2512.22096 • Published Dec 26, 2025 • 60
LLaDA2.0: Scaling Up Diffusion Language Models to 100B Paper • 2512.15745 • Published Dec 10, 2025 • 87
DeContext as Defense: Safe Image Editing in Diffusion Transformers Paper • 2512.16625 • Published Dec 18, 2025 • 25
IC-Effect: Precise and Efficient Video Effects Editing via In-Context Learning Paper • 2512.15635 • Published Dec 17, 2025 • 20
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length Paper • 2512.04677 • Published Dec 4, 2025 • 173