facebook/webssl-dino300m-full2b-224 Image Feature Extraction • 0.3B • Updated Apr 24, 2025 • 4.14k • 11
Scale RAE Collection Collection for "Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders" • 7 items • Updated 12 days ago • 3
RAE Collection Collection for Diffusion Transformers with Representation Autoencoders • 1 item • Updated Oct 14, 2025 • 11
OneFlow: Concurrent Mixed-Modal and Interleaved Generation with Edit Flows Paper • 2510.03506 • Published Oct 3, 2025 • 15
Learning to See Before Seeing: Demystifying LLM Visual Priors from Language Pre-training Paper • 2509.26625 • Published Sep 30, 2025 • 43
V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann • 8 items • Updated Jun 13, 2025 • 191
Cosmos-Tokenize1 Collection A suite of image and video tokenizers • 9 items • Updated 10 days ago • 9