Provecta Computatione

Text-to-speech synthesis

blog.schihei.de

A little Python project

Heiko Joerg Schick
Sep 7, 2022

By 2026, conversational artificial intelligence deployments within contact centres will reduce agent labour costs by $80 billion, according to Gartner. As digital transformation advances, many jobs in call centres and the wider customer service industry are affected. Gartner projects that one in 10 agent interactions will be automated by 2026. The advantage of conversational artificial intelligence is that it can automate customer interactions across both voice and digital channels. [1]

A key technology for conversational agents is text-to-speech (TTS) synthesis, which produces human speech artificially.

I started using such technologies to combine the receptive skills of reading and listening, which helps people learn a language better and study more efficiently. For this reason, I wrote a small Python project that converts an input text file into a spoken audio output file, allowing me to generate voice-overs for my blog and op-ed posts.
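To make the text-file-to-audio-file flow more concrete, here is a minimal sketch. It is not the code from my repository: the names `split_sentences`, `placeholder_synthesizer` and `text_file_to_wav`, the sentence-splitting rule, and the 22050 Hz sample rate are all assumptions chosen for illustration. The synthesizer is a stand-in that emits a quiet tone, so the sketch runs with the standard library alone; a real TTS model would take its place.

```python
import math
import re
import struct
import wave
from pathlib import Path
from typing import Callable, List

SAMPLE_RATE = 22050  # assumed output rate; a real model reports its own

# A synthesizer maps one chunk of text to mono samples in [-1.0, 1.0].
Synthesizer = Callable[[str], List[float]]


def split_sentences(text: str) -> List[str]:
    """Split text into chunks at '.', '!' or '?' followed by whitespace."""
    chunks = re.split(r"(?<=[.!?])\s+", text.strip())
    return [c for c in chunks if c]


def placeholder_synthesizer(chunk: str) -> List[float]:
    """Hypothetical stand-in for a TTS model: a 220 Hz tone, 50 ms per character."""
    n = SAMPLE_RATE * len(chunk) // 20
    return [0.2 * math.sin(2 * math.pi * 220 * i / SAMPLE_RATE) for i in range(n)]


def text_file_to_wav(src: Path, dst: Path, synthesize: Synthesizer) -> int:
    """Read src, synthesize it chunk by chunk, write a 16-bit mono WAV file.

    Returns the number of samples written.
    """
    samples: List[float] = []
    for chunk in split_sentences(src.read_text(encoding="utf-8")):
        samples.extend(synthesize(chunk))

    # Clip to [-1, 1] and convert to signed 16-bit little-endian PCM.
    ints = (int(max(-1.0, min(1.0, s)) * 32767) for s in samples)
    pcm = struct.pack(f"<{len(samples)}h", *ints)

    with wave.open(str(dst), "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(SAMPLE_RATE)
        out.writeframes(pcm)
    return len(samples)
```

Calling `text_file_to_wav(Path("post.txt"), Path("post.wav"), placeholder_synthesizer)` would write a tone-only WAV file (the file names are examples). To get actual speech, pass a function that wraps a real model instead of the placeholder; the rest of the pipeline stays the same.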

The project was inspired by Dr Tristan Behrens's [2] arxiv-reader [3], which converts arXiv papers to audio. It also uses the FastSpeech2 model from fairseq S^2.

The voice-over for this post was also generated with this project.

To execute the Python script, run the following commands (please note that I use pyenv).

make env
make tts

The project is available via the following repository:

https://gitlab.schihei.de/schihei/tts

[1] Michael Spencer, "Gartner's Prediction about Call Centers 💁🛒👥", AI Supremacy. Covers the prediction Gartner released on August 31st, 2022.
[2] https://www.linkedin.com/in/dr-tristan-behrens-734967a2/
[3] https://github.com/AI-Guru/arxiv-reader

© 2023 Heiko Joerg Schick