Hey everyone!
We have been exploring various open-source Text-to-Speech (TTS) models, and decided to create a Hugging Face demo space that makes it easy to compare their quality side-by-side.
The demo features 12 popular TTS models, all tested using a consistent prompt, so you can quickly hear and compare their synthesized speech and choose the best one for your audio projects.
Would love to get feedback or suggestions!
👉 Check out the demo space and detailed comparison here!
👉 Check out the blog: Choosing the Right Text-to-Speech Model: Part 2
Share your use-case and we will update this space as required!
Which TTS model sounds most natural to you?
Cheers!
Hi 👋 I'm researching the best speech-to-text model that supports multiple languages and can auto-detect the language for real-time streaming.
I'm really struggling to find the right platform or service. Deepgram has done solid SEO, so I keep finding articles that say it's better, but it doesn't support automatic language detection for real-time streaming! Has anyone used Google Speech-to-Text or any other service that supports this? Or any open-source model? Thanks so much!
Hi everyone, I'm VB, the GPU poor in residence (focus on open source audio and on-device ML) at Hugging Face! 🤗
Quite pleased to introduce you to Parler TTS v1 🔉 - 885M (Mini) & 2.2B (Large) - fully open-source Text-to-Speech models! 🤙
Some interesting things about it:
Trained on 45,000 hours of open speech (datasets released as well)
Up to 4x faster generation thanks to torch compile & static KV cache (compared to the previous v0.1 release)
Mini is trained with a larger text encoder; Large is trained with both a larger text encoder and a larger decoder
Also supports SDPA & Flash Attention 2 for an added speed boost
Built-in streaming: we provide a dedicated streaming class optimised for time to first audio
Better speaker consistency: choose from more than a dozen speakers, or write a speaker description prompt and use that
Not convinced by a speaker? You can fine-tune the model on your own dataset (only a couple of hours would do)
Apache 2.0 licensed codebase, weights and datasets! 🤗
Can't wait to see what y'all would build with this!🫡
Quick links:
Model checkpoints: https://huggingface.co/collections/parler-tts/parler-tts-fully-open-source-high-quality-tts-66164ad285ba03e8ffde214c
Space: https://huggingface.co/spaces/parler-tts/parler_tts
GitHub Repo: https://github.com/huggingface/parler-tts
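To make the speaker-description prompting mentioned above a bit more concrete, here is a minimal sketch of generating audio with the Mini checkpoint. It assumes the `parler_tts` package from the GitHub repo, plus `transformers`, `torch` and `soundfile`, are installed, and that the checkpoint ID `parler-tts/parler-tts-mini-v1` from the collection is the one you want. `build_description` and `synthesize` are hypothetical helper names made up for this example, not part of the library, and the description wording is only illustrative.

```python
def build_description(gender="female", pace="moderate", quality="very high"):
    """Compose a free-text speaker description.

    Hypothetical helper for this example only; Parler TTS just takes a
    plain-text description, so any wording of your own works too.
    """
    return (
        f"A {gender} speaker delivers a slightly expressive speech "
        f"at a {pace} speed and pitch. "
        f"The recording is of {quality} quality, "
        "with the voice sounding clear and close up."
    )


def synthesize(prompt, description,
               checkpoint="parler-tts/parler-tts-mini-v1",
               out_path="parler_tts_out.wav"):
    """Generate speech for `prompt` in the voice described by `description`."""
    # Heavy imports stay local so build_description works without them.
    import torch
    import soundfile as sf
    from parler_tts import ParlerTTSForConditionalGeneration
    from transformers import AutoTokenizer

    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    model = ParlerTTSForConditionalGeneration.from_pretrained(checkpoint).to(device)
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)

    # The description conditions the voice; the prompt is the text to speak.
    input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
    prompt_input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

    generation = model.generate(input_ids=input_ids,
                                prompt_input_ids=prompt_input_ids)
    audio = generation.cpu().numpy().squeeze()

    # Write the waveform at the sampling rate stored in the model config.
    sf.write(out_path, audio, model.config.sampling_rate)
    return out_path
```

Calling `synthesize("Hey, how are you doing today?", build_description())` should download the checkpoint on first use and write a WAV file to `parler_tts_out.wav`. This sketch leaves out the torch compile, static KV cache, SDPA / Flash Attention 2 and streaming options; the GitHub repo is the place to check for how to enable those.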