PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published 12 days ago • 169
Latent Diffusion Model without Variational Autoencoder Paper • 2510.15301 • Published Oct 17, 2025 • 49
Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset Paper • 2510.15742 • Published Oct 17, 2025 • 51
D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI Paper • 2510.05684 • Published Oct 7, 2025 • 143
Lynx: Towards High-Fidelity Personalized Video Generation Paper • 2509.15496 • Published Sep 19, 2025 • 12
JAM-Flow: Joint Audio-Motion Synthesis with Flow Matching Paper • 2506.23552 • Published Jun 30, 2025 • 10
Seeing Voices: Generating A-Roll Video from Audio with Mirage Paper • 2506.08279 • Published Jun 9, 2025 • 27
Efficient LLaMA-3.2-Vision by Trimming Cross-attended Visual Features Paper • 2504.00557 • Published Apr 1, 2025 • 15
SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation Paper • 2503.09641 • Published Mar 12, 2025 • 42
SANA-Sprint Collection • SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation • 6 items • Updated Sep 13, 2025 • 43
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization Paper • 2412.17739 • Published Dec 23, 2024 • 41
FastVLM: Efficient Vision Encoding for Vision Language Models Paper • 2412.13303 • Published Dec 17, 2024 • 73
FashionComposer: Compositional Fashion Image Generation Paper • 2412.14168 • Published Dec 18, 2024 • 17
FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on Paper • 2411.10499 • Published Nov 15, 2024 • 13
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 56
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality Paper • 2410.19355 • Published Oct 25, 2024 • 24
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 129