CLOUD TEXT2SPEECH SYNTHESIS

Lifelike Text to Speech for your customers

Make your products more engaging with our voice solutions. Our text-to-speech solutions improve digital accessibility for people with learning and speech disabilities, visual impairments, or low literacy, across devices and platforms, so your organization’s message is clear and comprehensible to everyone.

Powered by Prolope machine learning

Apply deep neural networks to synthesize speech from text in a variety of voices and languages. Our neural networks build on our experience in comics and on our partners' work in speech synthesis.

Select from 100+ voices

Cloud Text-to-Speech offers a selection of 100+ voices across 20+ languages and variants, enabling clients to pick the voice that works best for their application.

High-fidelity speech synthesis

Cloud Text-to-Speech converts text into human-like speech in more than 100 voices across 20+ languages and variants. It applies groundbreaking research in speech synthesis and powerful neural networks to deliver high-fidelity audio. With this easy-to-use service, you can create lifelike interactions with your users that transform customer service, device interaction, and other applications.

Cloud Text-to-Speech features

Multilingual

Supports 100+ voices across 20+ languages and variants, with more to come soon.

Text and SSML Support

Customize your speech with SSML tags that let you add pauses, control how numbers, dates, and times are read, and give other pronunciation instructions.
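As an illustration of the SSML support described above, here is a minimal Python sketch that builds an SSML document with a pause and a date reading. The `<speak>`, `<break>`, and `<say-as>` elements come from the W3C SSML standard; the function name `build_ssml`, the sample sentence, and the `interpret-as="date"` / `format="mdy"` values are assumptions chosen for illustration, and the set of tags and attribute values this service accepts may differ.

```python
import xml.etree.ElementTree as ET


def build_ssml(greeting: str, date_text: str, pause_ms: int = 500) -> str:
    """Return an SSML string: greeting, a pause, then a sentence with a date."""
    speak = ET.Element("speak")
    speak.text = greeting

    # <break> inserts a pause of the given length.
    pause = ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
    pause.tail = " Your appointment is on "

    # <say-as> hints how the enclosed text should be read aloud.
    # The attribute values below are assumed, not confirmed for this service.
    say_as = ET.SubElement(
        speak, "say-as", {"interpret-as": "date", "format": "mdy"}
    )
    say_as.text = date_text
    say_as.tail = "."

    # ElementTree escapes special characters such as & and < in the text.
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("Hello.", "10/12/2024")
print(ssml)
```

Building the markup with an XML library rather than string concatenation keeps user-supplied text from breaking the document structure.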

Speaking Rate Tuning

Set the speaking rate to be up to 10x faster or slower than the normal rate.

Pitch Tuning

Customize the pitch of your selected voice, up to 20 semitones more or less than the default output.

Volume Gain Control

Increase the volume of the output by up to 16 dB or decrease it by up to 96 dB.
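The rate, pitch, and volume limits listed above can be checked on the client before a request is sent. The sketch below is one way to do that in Python; the class `VoiceTuning`, its field names, and the helper functions are hypothetical and are not part of the service's API. It assumes "10x faster or slower" means a rate multiplier between 0.1 and 10.0. The two conversion helpers use the standard definitions of a semitone (a frequency ratio of 2^(1/12)) and of decibels for amplitude (20·log10 of the ratio).

```python
from dataclasses import dataclass

# Limits taken from the feature descriptions; all names here are hypothetical.
RATE_RANGE = (0.1, 10.0)     # multiplier of the normal rate (assumed reading of "10x")
PITCH_RANGE = (-20.0, 20.0)  # semitones relative to the default voice
GAIN_RANGE = (-96.0, 16.0)   # decibels relative to the default volume


def _check(name: str, value: float, bounds: tuple) -> None:
    """Raise ValueError if value falls outside the inclusive bounds."""
    low, high = bounds
    if not low <= value <= high:
        raise ValueError(f"{name}={value} is outside [{low}, {high}]")


@dataclass(frozen=True)
class VoiceTuning:
    speaking_rate: float = 1.0
    pitch_semitones: float = 0.0
    volume_gain_db: float = 0.0

    def __post_init__(self) -> None:
        # Validate every field as soon as the object is created.
        _check("speaking_rate", self.speaking_rate, RATE_RANGE)
        _check("pitch_semitones", self.pitch_semitones, PITCH_RANGE)
        _check("volume_gain_db", self.volume_gain_db, GAIN_RANGE)


def pitch_ratio(semitones: float) -> float:
    """Frequency ratio corresponding to a shift of the given semitones."""
    return 2.0 ** (semitones / 12.0)


def amplitude_ratio(gain_db: float) -> float:
    """Amplitude ratio corresponding to a gain in decibels."""
    return 10.0 ** (gain_db / 20.0)


tuning = VoiceTuning(speaking_rate=1.5, pitch_semitones=-4.0, volume_gain_db=6.0)
print(tuning)
print(round(pitch_ratio(20.0), 2), round(amplitude_ratio(16.0), 2))
```

Under these definitions, the maximum pitch shift of 20 semitones is a little more than a tripling of frequency, and the maximum gain of 16 dB is roughly a sixfold increase in amplitude.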

Audio Format Flexibility

Choose from a number of audio formats, including MP3, Linear16, and Ogg Opus.

Audio Profiles

Optimize the audio for the type of speaker it will play on, such as headphones or phone lines.