
TTS

Also known as: speech synthesis, text-to-speech

TTS is a technology that reads text aloud in a natural-sounding voice. It underpins AI dubbing and audiobooks.

TTS (text-to-speech) is a technology that converts written text into natural-sounding human speech. It is also called speech synthesis. Unlike the stiff, mechanical announcements of the past, recent TTS reproduces intonation, emotion, and even breathing sounds, to the point where it can be difficult to distinguish from a voice actor's reading.

TTS began as a way to access information for visually impaired people and in situations where the screen cannot be seen. Its use has since expanded to navigation guidance, audiobooks, video dubbing, and AI call assistants. In particular, voice cloning, which reproduces a specific person's voice from a sample only a few seconds long, is changing how content is produced.

However, as voice cloning has become easier, concerns about voice phishing and deepfake voice abuse have grown, and the rights of voice professionals such as voice actors are emerging as a new issue.

✅ Why it matters

⚠️ Limits and debates
