Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2510.20822

HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives

Paper • 2510.20822 • Published Oct 23 • 40

Video Generation

VISTA: A Test-Time Self-Improving Video Generation Agent

Paper • 2510.15831 • Published Oct 17 • 20
HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives

Paper • 2510.20822 • Published Oct 23 • 40
Video-As-Prompt: Unified Semantic Control for Video Generation

Paper • 2510.20888 • Published Oct 23 • 45

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 29
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 14
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives

Paper • 2510.20822 • Published Oct 23 • 40

HoloScene: Simulation-Ready Interactive 3D Worlds from a Single Video

Paper • 2510.05560 • Published Oct 7 • 7
TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning

Paper • 2510.06217 • Published Oct 7 • 63
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 495
Fast-dLLM v2: Efficient Block-Diffusion LLM

Paper • 2509.26328 • Published Sep 30 • 54

HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives

Paper • 2510.20822 • Published Oct 23 • 40

HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives

Paper • 2510.20822 • Published Oct 23 • 40

Video Generation

VISTA: A Test-Time Self-Improving Video Generation Agent

Paper • 2510.15831 • Published Oct 17 • 20
HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives

Paper • 2510.20822 • Published Oct 23 • 40
Video-As-Prompt: Unified Semantic Control for Video Generation

Paper • 2510.20888 • Published Oct 23 • 45

HoloScene: Simulation-Ready Interactive 3D Worlds from a Single Video

Paper • 2510.05560 • Published Oct 7 • 7
TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning

Paper • 2510.06217 • Published Oct 7 • 63
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 495
Fast-dLLM v2: Efficient Block-Diffusion LLM

Paper • 2509.26328 • Published Sep 30 • 54

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 29
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 14
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs