321

Avi Basa

avahal

AI & ML interests

None yet

Recent Activity

commented on a paper 12 minutes ago

Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

commented on a paper about 9 hours ago

Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion

commented on a paper about 9 hours ago

Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation

Organizations

None yet

commented on a paper 12 minutes ago

Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

Paper • 2512.22615 • Published 6 days ago • 37

commented on 19 papers about 9 hours ago

Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion

Paper • 2512.23709 • Published 4 days ago • 36

Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation

Paper • 2512.23705 • Published 4 days ago • 38

Yume-1.5: A Text-Controlled Interactive World Generation Model

Paper • 2512.22096 • Published 7 days ago • 55

LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation

Paper • 2512.23576 • Published 4 days ago • 61

Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss

Paper • 2512.23447 • Published 4 days ago • 85

A 58-Addition, Rank-23 Scheme for General 3x3 Matrix Multiplication

Paper • 2512.21980 • Published 7 days ago • 2

Rethinking Sample Polarity in Reinforcement Learning with Verifiable Rewards

Paper • 2512.21625 • Published 8 days ago • 3

SVBench: Evaluation of Video Generation Models on Social Reasoning

Paper • 2512.21507 • Published 8 days ago • 7

SlideTailor: Personalized Presentation Slide Generation for Scientific Papers

Paper • 2512.20292 • Published 10 days ago • 8

SWE-RM: Execution-free Feedback For Software Engineering Agents

Paper • 2512.21919 • Published 7 days ago • 8

InSight-o3: Empowering Multimodal Foundation Models with Generalized Visual Search

Paper • 2512.18745 • Published 12 days ago • 10

Omni-Weather: Unified Multimodal Foundation Model for Weather Generation and Understanding

Paper • 2512.21643 • Published 8 days ago • 10

See Less, See Right: Bi-directional Perceptual Shaping For Multimodal Reasoning

Paper • 2512.22120 • Published 7 days ago • 12

ProEdit: Inversion-based Editing From Prompts Done Right

Paper • 2512.22118 • Published 7 days ago • 16

Masking Teacher and Reinforcing Student for Distilling Vision-Language Models

Paper • 2512.22238 • Published 10 days ago • 17

TimeBill: Time-Budgeted Inference for Large Language Models

Paper • 2512.21859 • Published 7 days ago • 18

UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture

Paper • 2512.21675 • Published 8 days ago • 24

MAI-UI Technical Report: Real-World Centric Foundation GUI Agents

Paper • 2512.22047 • Published 7 days ago • 25

Mindscape-Aware Retrieval Augmented Generation for Improved Long Context Understanding

Paper • 2512.17220 • Published 14 days ago • 89