Quan-Sheng Zeng (SII)

ashun989

AI & ML interests

None yet

Recent Activity

updated a model 17 days ago

ashun989/GlimpsePrune-Plus_Qwen2.5-VL-7B-Instruct

Organizations

None yet

upvoted a paper 19 days ago

Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models

Paper • 2602.07026 • Published 27 days ago • 137

upvoted 3 papers 20 days ago

LLaDA2.1: Speeding Up Text Diffusion via Token Editing

Paper • 2602.08676 • Published 20 days ago • 68

MOVA: Towards Scalable and Synchronized Video-Audio Generation

Paper • 2602.08794 • Published 20 days ago • 154

Infinite-World: Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory

Paper • 2602.02393 • Published 27 days ago • 16

upvoted a paper 2 months ago

UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture

Paper • 2512.21675 • Published Dec 25, 2025 • 25

upvoted 2 papers 4 months ago

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6, 2025 • 240

Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization

Paper • 2510.13554 • Published Oct 15, 2025 • 58

upvoted 2 papers 5 months ago

Democratizing AI scientists using ToolUniverse

Paper • 2509.23426 • Published Sep 27, 2025 • 40

TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs

Paper • 2509.18056 • Published Sep 22, 2025 • 27

upvoted a paper 7 months ago

A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models

Paper • 2508.01548 • Published Aug 3, 2025 • 14

upvoted a collection 7 months ago

GlimpsePrune

A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models. https://github.com/HVision-NKU/GlimpsePrune • 6 items • Updated Aug 5, 2025 • 1

upvoted a paper 7 months ago

Gaussian Splatting with Discretized SDF for Relightable Assets

Paper • 2507.15629 • Published Jul 21, 2025 • 23

upvoted a paper 8 months ago

LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs

Paper • 2506.21862 • Published Jun 27, 2025 • 36