Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs Paper • 2602.10388 • Published 10 days ago • 219
WTF GENIUS PAPERS Collection Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are abt improving tiny models. • 67 items • Updated 3 days ago • 8
Where Did This Sentence Come From? Tracing Provenance in LLM Reasoning Distillation Paper • 2512.20908 • Published Dec 24, 2025 • 29
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published 16 days ago • 320
MAGMA: A Multi-Graph based Agentic Memory Architecture for AI Agents Paper • 2601.03236 • Published Jan 6 • 5
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 226
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models Paper • 2601.22060 • Published 22 days ago • 154
TranslateGemma VLLM Collection Modified version of google/translategemma-4/12/27b-it optimized for deployment with vLLM. • 4 items • Updated 6 days ago • 1
Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning Paper • 2601.09088 • Published Jan 14 • 63
Apertus LLM Collection Democratizing Open and Compliant LLMs for Global Language Environments: 8B and 70B open-data open-weights models, multilingual in >1000 languages • 4 items • Updated Oct 1, 2025 • 331
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI Paper • 2512.16676 • Published Dec 18, 2025 • 219
view article Article The Open Evaluation Standard: Benchmarking NVIDIA Nemotron 3 Nano with NeMo Evaluator Dec 17, 2025 • 47
view article Article Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance Dec 9, 2025 • 84