deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B Text Generation • 2B • Updated Feb 24, 2025 • 647k • • 1.46k
FastCuRL: Curriculum Reinforcement Learning with Progressive Context Extension for Efficient Training R1-like Reasoning Models Paper • 2503.17287 • Published Mar 21, 2025 • 11