SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning Paper • 2602.13515 • Published 7 days ago • 17
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published Dec 18, 2025 • 95
SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published 7 days ago • 46
Geometry-Aware Rotary Position Embedding for Consistent Video World Model Paper • 2602.07854 • Published 12 days ago • 6
Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization Paper • 2602.02958 • Published 18 days ago • 33
World Simulation with Video Foundation Models for Physical AI Paper • 2511.00062 • Published Oct 28, 2025 • 44