DICE: Diffusion Large Language Models Excel at Generating CUDA Kernels • Paper • 2602.11715 • Published 6 days ago • 5 • 3
Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm • Paper • 2602.11543 • Published 7 days ago • 4 • 3
NanoQuant: Efficient Sub-1-Bit Quantization of Large Language Models • Paper • 2602.06694 • Published 12 days ago • 15 • 4
SimpleGPT: Improving GPT via A Simple Normalization Strategy • Paper • 2602.01212 • Published 17 days ago • 3 • 4
OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer • Paper • 2601.14250 • Published 29 days ago • 47 • 5