MerlinLi's Collections
text-to-speech
updated
FlashSpeech: Efficient Zero-Shot Speech Synthesis
Paper • 2404.14700 • Published • 32
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale
Paper • 2306.15687 • Published
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
Paper • 2403.03100 • Published • 38
Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization
Paper • 2404.09956 • Published • 12
Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts
Paper • 2307.07218 • Published • 27
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias
Paper • 2306.03509 • Published • 5
parler-tts/dac_44khZ_8kbps
76.7M • Updated • 1.43k • 19
parler-tts/parler_tts_mini_v0.1
Text-to-Speech • 0.6B • Updated • 3.22k • 358
Wenetspeech4TTS/WenetSpeech4TTS
Updated • 1.35k • 82
liuhuadai/AudioLCM
Text-to-Audio • Updated • 8 • 9
kyutai/mimi
Feature Extraction • 96.2M • Updated • 410k • 270
hexgrad/Kokoro-82M
Text-to-Speech • Updated • 4.13M • 5.38k
HKUSTAudio/Llasa-3B
Text-to-Speech • 4B • Updated • 1.71k • 522
Zyphra/Zonos-v0.1-hybrid
Text-to-Speech • Updated • 42k • 1.1k
stepfun-ai/Step-Audio-TTS-3B
Text-to-Speech • 4B • Updated • 192 • 192
ByteDance/MegaTTS3
Text-to-Speech • Updated • 181 • 412
nari-labs/Dia-1.6B
Text-to-Speech • Updated • 162k • 2.81k