Text to Speech

The Text to Speech service analyzes text and natural language to generate synthesized audio with appropriate cadence and intonation. It is available in 12 voices across 7 languages, and select voices now offer Expressive Synthesis.


Input Text

The language of the input text must match the language of the selected voice: mixing languages (for example, English text with a Spanish male voice) does not produce valid results. The synthesized audio is streamed to the client as it is produced, using HTTP chunked transfer encoding. The audio is returned in the Ogg Opus format, which can be played with VLC or Audacity.
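
As a rough illustration of consuming a chunked audio stream, the sketch below reads a response piece by piece, writes each piece as it arrives, and checks that the result starts like an Ogg Opus file. The function names (`iter_response`, `write_stream`, `looks_like_ogg_opus`) are hypothetical helpers. The service endpoint, request parameters, and authentication are not described here, so the caller is assumed to supply an already prepared request. The format check relies only on the Ogg page layout ("OggS" capture pattern, 27-byte page header) and the "OpusHead" signature of the first Opus packet.

```python
import io
import urllib.request
from typing import BinaryIO, Iterable, Iterator


def iter_response(request: urllib.request.Request,
                  chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield the response body in pieces as it arrives.

    The standard library decodes HTTP chunked transfer encoding
    transparently. The request (URL, credentials, voice, text) is an
    assumption: it must be built by the caller for the real service.
    """
    with urllib.request.urlopen(request) as resp:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            yield chunk


def write_stream(chunks: Iterable[bytes], out: BinaryIO) -> int:
    """Write each non-empty chunk to `out`; return the total byte count."""
    total = 0
    for chunk in chunks:
        if chunk:
            out.write(chunk)
            total += len(chunk)
    return total


def looks_like_ogg_opus(data: bytes) -> bool:
    """Heuristic check that `data` begins with an Ogg page holding OpusHead."""
    if len(data) < 27 or data[:4] != b"OggS":
        return False
    n_segments = data[26]              # number of entries in the segment table
    payload_start = 27 + n_segments    # payload follows header + segment table
    return data[payload_start:payload_start + 8] == b"OpusHead"


if __name__ == "__main__":
    # Offline demonstration with a hand-built first page instead of a live call.
    fake_page = b"OggS" + bytes(22) + b"\x01" + b"\x13" + b"OpusHead" + bytes(11)
    buffer = io.BytesIO()
    write_stream([fake_page[:10], fake_page[10:]], buffer)
    print(looks_like_ogg_opus(buffer.getvalue()))
```

With a real request, `write_stream(iter_response(request), open("speech.ogg", "wb"))` would save the audio incrementally, and the resulting file could then be opened in a player such as VLC.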
