Haocheng Xi

Xihc20

AI & ML interests

None yet

Recent Activity

upvoted a paper 2 days ago

Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization

Paper • 2602.02958 • Published 5 days ago • 32

New activity in xihc-ucb/Qwen2.5-7B-train-Quasar-1002 2 months ago

Upload FP8Qwen2ForCausalLM

#7 opened 2 months ago by Xihc20

New activity in xihc-ucb/Qwen2.5-7B-Instruct-train-Quasar-1002 2 months ago

Upload FP8Qwen2ForCausalLM

#9 opened 2 months ago by Xihc20

Upload FP8Qwen2ForCausalLM

#8 opened 2 months ago by Xihc20

upvoted a paper 4 months ago

SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention

Paper • 2509.24006 • Published Sep 28, 2025 • 118

upvoted 5 papers 9 months ago

Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation

Paper • 2505.18875 • Published May 24, 2025 • 42

GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning

Paper • 2505.17022 • Published May 22, 2025 • 27

Thinkless: LLM Learns When to Think

Paper • 2505.13379 • Published May 19, 2025 • 50

Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction

Paper • 2505.11254 • Published May 16, 2025 • 48

AdaptThink: Reasoning Models Can Learn When to Think

Paper • 2505.13417 • Published May 19, 2025 • 83

upvoted a paper 10 months ago

Learning Adaptive Parallel Reasoning with Language Models

Paper • 2504.15466 • Published Apr 21, 2025 • 44

updated a dataset 11 months ago

Efficient-Large-Model/COAT-ToolBench

Updated Mar 26, 2025 • 32

published a dataset 11 months ago

Efficient-Large-Model/COAT-ToolBench

Updated Mar 26, 2025 • 32

published a Space 11 months ago

Sparse VideoGen

📈

Demos

upvoted 2 papers 12 months ago

S*: Test Time Scaling for Code Generation

Paper • 2502.14382 • Published Feb 20, 2025 • 63

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published Feb 16, 2025 • 167

upvoted 2 papers about 1 year ago

SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration

Paper • 2411.10958 • Published Nov 17, 2024 • 57

NVILA: Efficient Frontier Visual Language Models

Paper • 2412.04468 • Published Dec 5, 2024 • 59

updated a dataset about 1 year ago

Xihc20/CropVBench

Preview • Updated Dec 3, 2024 • 13

upvoted an article about 1 year ago

Article

Unbelievable! Run 70B LLM Inference on a Single 4GB GPU with This NEW Technique

Nov 30, 2023
