OxyGen: Unified KV Cache Management for Vision-Language-Action Models under Multi-Task Parallelism Paper • 2603.14371 • Published 3 days ago • 4
Vec-LUT: Vector Table Lookup for Parallel Ultra-Low-Bit LLM Inference on Edge Devices Paper • 2512.06443 • Published Dec 6, 2025 • 2 • 1
vlut.cpp Collection SOTA ternary-packed versions of 1.58-bit LLMs for efficient on-device inference with vlut.cpp. • 3 items • Updated Jan 1 • 1
Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models Paper • 2511.08577 • Published Nov 11, 2025 • 109