📣 Looking for labeled, high-quality synthetic audio/TTS data 📣 Have you been calling API endpoints from OpenAI, ElevenLabs, etc.? Do you have labeled audio data sitting around gathering dust? Let's talk! Join https://discord.gg/QuGxSWBfQy or comment below.

If your data exceeds the quantity & quality thresholds and is approved into the next hexgrad/Kokoro-82M training mix, and you DM me the data under an effective Apache license, then I will DM back the corresponding voicepacks for YOUR data if/when the next Apache-licensed Kokoro base model drops.

What does this mean? If you've been calling closed-source TTS or audio API endpoints to:
- Build voice agents
- Make long-form audio, like audiobooks or podcasts
- Handle customer support, etc.
Then YOU can contribute to the training mix and get useful artifacts in return. ❤️

More details at hexgrad/Kokoro-82M#21

25 replies

upvoted 3 collections about 1 year ago

reacted to fdaudens's post with 🚀❤️👀 about 1 year ago


Exciting news for audio AI enthusiasts! 🎙️🌍

The Emilia dataset dropped last week, and it's a cool one:
- 101k+ hours of high-quality audio
- 6 languages: 🇨🇳 🇺🇸 🇯🇵 🇰🇷 🇩🇪 🇫🇷
- Diverse content: talk shows, interviews, debates, sports commentary, audiobooks

This dataset could improve multilingual speech generation and recognition. Opens up many possibilities for global media, language learning, and accessibility!

Explore it: amphion/Emilia

#AIAudio

liked a Space about 1 year ago

Stick To Your Role! Leaderboard

🎭

Benchmarking LLMs on the stability of simulated populations

reacted to Undi95's post with 🔥 over 1 year ago


Hello!
The 8B/70B OG Llama-3 models made with the Orthogonal Activation Steering script have been pushed privately.

After multiple tests with an empty system prompt, I can confirm it's not uncensored enough, but I wanted to try all the GGUFs first (and that takes time, lmao).

If you want to try that yourself, here is the script: https://gist.github.com/wassname/42aba7168bb83e278fcfea87e70fa3af
And here is the same script, modified so it can run on multiple GPUs for the 70B: https://files.catbox.moe/ya4rto.ipynb

Llama3-Unholy-8B-OAS doesn't have the problem, as it was already trained to be less censored, but the OG one was really far too censored.

I will try to redo that soon, as it seems to HAVE WORKED for some prompts (as seen in the log, for example), but it's not enough.

32 dataset entries are clearly not enough, but that's okay; I really wanted to try this as it was something new.
I could take the Unholy way and retrain the 70B before using OAS, but it should work without that; retraining is not the goal.

61 replies

upvoted a collection over 1 year ago

Yi 1.5 GGUFs

Collection

Collection of Yi 1.5 GGUFs made with gguf-my-repo • 15 items • Updated May 20, 2024 • 5

Nam Nguyen

ngphuchoangnam's activity

Higgs Audio Demo
