Hey! In this video, we will talk about Text-to-Speech (TTS) models and how to fine-tune them for different languages! We will fine-tune SpeechT5, a transformer-based model.
▬ Contents of this video ▬▬▬▬▬▬▬▬▬▬
0:00 - Intro
0:12 - Basic Theory of Text-to-Speech Models
10:04 - Coding and Fine-tuning the Model
Google Colab training code:
https://colab.research.google.com/drive/1OYxDK1pZftXG8t0ObOmXSwALOvgfzt0t?usp=sharing
If you prefer GitHub:
https://github.com/emirhanbilgic/Turkish-TTS
The model weights:
https://huggingface.co/emirhanbilgic/speecht5_finetuned_emirhan_tr
The demo:
https://huggingface.co/spaces/emirhanbilgic/Text-to-speech-Turkish
SpeechT5 paper:
https://arxiv.org/abs/2110.07205
"Attention Is All You Need" paper:
https://arxiv.org/abs/1706.03762
Good to check:
https://huggingface.co/learn/audio-course/chapter6/fine-tuning
For 1-on-1 meetings: https://calendly.com/emirhanbilgicai/ai-consulting
#texttospeech #tts #nlp #artificialintelligence #machinelearning