Open to Collab

Xiangyu Li

XXXXyu

https://xxxxyu.github.io/academic

xxxxyu

AI & ML interests

On-device and physical AI

Organizations

None yet

Recent Activity

authored a paper 1 day ago

OxyGen: Unified KV Cache Management for Vision-Language-Action Models under Multi-Task Parallelism

Paper • 2603.14371 • Published 3 days ago • 4

commented on a paper 1 day ago

Vec-LUT: Vector Table Lookup for Parallel Ultra-Low-Bit LLM Inference on Edge Devices

Paper • 2512.06443 • Published Dec 6, 2025 • 2

upvoted a paper 1 day ago

OxyGen: Unified KV Cache Management for Vision-Language-Action Models under Multi-Task Parallelism

Paper • 2603.14371 • Published 3 days ago • 4

submitted a paper to Daily Papers 1 day ago

OxyGen: Unified KV Cache Management for Vision-Language-Action Models under Multi-Task Parallelism

Paper • 2603.14371 • Published 3 days ago • 4

upvoted a paper 1 day ago

Vec-LUT: Vector Table Lookup for Parallel Ultra-Low-Bit LLM Inference on Edge Devices

Paper • 2512.06443 • Published Dec 6, 2025 • 2

authored a paper 13 days ago

Vec-LUT: Vector Table Lookup for Parallel Ultra-Low-Bit LLM Inference on Edge Devices

Paper • 2512.06443 • Published Dec 6, 2025 • 2

upvoted a collection 3 months ago

vlut.cpp

SOTA ternary-packed versions of 1.58-bit LLMs for efficient on-device inference with vlut.cpp. • 3 items • Updated Jan 1 • 1

updated a collection 3 months ago

vlut.cpp

SOTA ternary-packed versions of 1.58-bit LLMs for efficient on-device inference with vlut.cpp. • 3 items • Updated Jan 1 • 1

updated 4 models 3 months ago

XXXXyu/Llama3-8B-1.58-100B-tokens-vlut-gguf

Text Generation • 8B • Updated Jan 1 • 28

XXXXyu/bitnet_b1_58-3B-vlut-gguf

Text Generation • 3B • Updated Jan 1 • 41

XXXXyu/Falcon3-1B-Instruct-1.58bit-vlut-gguf

Text Generation • 2B • Updated Jan 1 • 48

XXXXyu/Qwen3-1.7B-w2g64-gptq_v2

2B • Updated Dec 30, 2025 • 12.4k

published 3 models 3 months ago

XXXXyu/Llama3-8B-1.58-100B-tokens-vlut-gguf

Text Generation • 8B • Updated Jan 1 • 28

XXXXyu/bitnet_b1_58-3B-vlut-gguf

Text Generation • 3B • Updated Jan 1 • 41

XXXXyu/Falcon3-1B-Instruct-1.58bit-vlut-gguf

Text Generation • 2B • Updated Jan 1 • 48

upvoted a paper 4 months ago

Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models

Paper • 2511.08577 • Published Nov 11, 2025 • 109

published a model 5 months ago

XXXXyu/Qwen3-1.7B-w2g64-gptq_v2

2B • Updated Dec 30, 2025 • 12.4k

liked a model about 1 year ago

1bitLLM/bitnet_b1_58-3B

Text Generation • 3B • Updated Mar 29, 2024 • 1.19k • 261