AMoE: Agglomerative Mixture-of-Experts Vision Foundation Model • Paper • 2512.20157 • Published Dec 23, 2025
Article • Announcing NeurIPS 2025 E2LM Competition: Early Training Evaluation of Language Models • Jul 4, 2025
Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers • Paper • 2601.04890 • Published Jan 2026
Article • Introducing Falcon-H1-Arabic: Pushing the Boundaries of Arabic Language AI with Hybrid Architecture • Jan 5, 2026
InternVL3.5-Core • Collection • Includes only the InternVL3.5 checkpoints that have completed the full training pipeline (Pretraining, SFT, MPO, Cascade RL) • 30 items • Updated Sep 28, 2025
Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance • Paper • 2507.22448 • Published Jul 30, 2025
RADIO • Collection • A collection of foundation vision models that combine multiple models (CLIP, DINOv2, SAM, etc.) • 18 items
Falcon-H1 • Collection • Falcon-H1 family of hybrid-head language models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained & instruction-tuned) • 39 items
Article • Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance • May 21, 2025
Space • The Ultra-Scale Playbook 🌌 • The ultimate guide to training LLMs on large GPU clusters