LLaDA-o: An Effective and Length-Adaptive Omni Diffusion Model Paper • 2603.01068 • Published 9 days ago • 19
From Pixels to Words -- Towards Native Vision-Language Primitives at Scale Paper • 2510.14979 • Published Oct 16, 2025 • 68
UniG2U-Bench: Do Unified Models Advance Multimodal Understanding? Paper • 2603.03241 • Published 6 days ago • 80
UniT: Unified Multimodal Chain-of-Thought Test-time Scaling Paper • 2602.12279 • Published 25 days ago • 20
OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence Paper • 2602.08683 • Published 28 days ago • 50
CoPE-VideoLM: Codec Primitives For Efficient Video Language Models Paper • 2602.13191 • Published 24 days ago • 30
lmms-lab-encoder/wd_temporal_grounding_frames_max_64_max_448x448_pixels_with_fps Updated 23 days ago • 157
GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning Paper • 2602.12099 • Published 25 days ago • 57
ProCLIP: Progressive Vision-Language Alignment via LLM-based Embedder Paper • 2510.18795 • Published Oct 21, 2025 • 11