UniAudio 2.0: A Unified Audio Language Model with Text-Aligned Factorized Audio Tokenization • Paper • 2602.04683 • Published 7 days ago • 2
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration • Paper • 2602.05400 • Published 7 days ago • 284
AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research • Paper • 2602.06540 • Published 6 days ago • 20
MiniCPM-o & MiniCPM-V • Collection • Multimodal models with leading performance. • 31 items • Updated 3 days ago • 64
OpenBEATs • Collection • Checkpoints for the WASPAA 2025 paper "OpenBEATs: A Fully Open-Source General-Purpose Audio Encoder" • 93 items • Updated 17 days ago • 5
mistralai/Voxtral-Mini-4B-Realtime-2602 • Automatic Speech Recognition • Updated 1 day ago • 3.2k • 476
Nemotron Speech • Collection • Open, state-of-the-art, production‑ready enterprise speech models from the NVIDIA Speech research team for ASR, TTS, Speaker Diarization and S2S • 9 items • Updated 7 days ago • 37