Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance Paper • 2511.13254 • Published Nov 17, 2025 • 136
TiDAR: Think in Diffusion, Talk in Autoregression Paper • 2511.08923 • Published Nov 12, 2025 • 126
The Smol Training Playbook • 2.95k • The secrets to building world-class LLMs
Cerebras REAP Collection Sparse MoE models compressed using REAP (Router-weighted Expert Activation Pruning) method • 26 items • Updated 8 days ago • 99
cerebras/GLM-4.5-Air-REAP-82B-A12B Text Generation • 82B • Updated Oct 21, 2025 • 11.5k • 108