BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights Paper • 2501.17790 • Published Jan 29, 2025 • 3
Group Think: Multiple Concurrent Reasoning Agents Collaborating at Token Level Granularity Paper • 2505.11107 • Published May 16, 2025 • 29
TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling Paper • 2504.07053 • Published Apr 9, 2025 • 5
The Breeze 2 Herd of Models: Traditional Chinese LLMs Based on Llama with Vision-Aware and Function-Calling Capabilities Paper • 2501.13921 • Published Jan 23, 2025 • 3
Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning Paper • 2307.10274 • Published Jul 18, 2023
Advancing the Evaluation of Traditional Chinese Language Models: Towards a Comprehensive Benchmark Suite Paper • 2309.08448 • Published Sep 15, 2023
Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition Paper • 2405.14259 • Published May 23, 2024 • 2
Enhancing Function-Calling Capabilities in LLMs: Strategies for Prompt Formats, Data Integration, and Multilingual Translation Paper • 2412.01130 • Published Dec 2, 2024 • 1