RubricBench: Aligning Model-Generated Rubrics with Human Standards Paper • 2603.01562 • Published 5 days ago • 51
Beyond Language Modeling: An Exploration of Multimodal Pretraining Paper • 2603.03276 • Published 4 days ago • 69
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation Paper • 2602.24286 • Published 8 days ago • 76
Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives Paper • 2601.20833 • Published Jan 28 • 182
GAIA Leaderboard 🦾 Space • Running on CPU Upgrade • 588 • Submit your model answers to GAIA benchmark and view leaderboard
StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation Paper • 2512.09363 • Published Dec 10, 2025 • 72
Multimodal Evaluation of Russian-language Architectures Paper • 2511.15552 • Published Nov 19, 2025 • 79