The basics of our text-to-speech API


What is it?

With the new text-to-speech API, developers can generate high-quality spoken audio from text. We're initially offering six preset voices and two model variants, tts-1 and tts-1-hd: tts-1 is optimized for real-time use cases, and tts-1-hd is optimized for quality. Pricing starts at $0.015 per 1,000 input characters (not tokens).
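
Because billing is per input character rather than per token, a rough cost estimate only needs the length of the text. The sketch below uses the $0.015 starting rate quoted above; the function name is hypothetical, and the rate that applies to your model variant is an assumption you should check against current pricing.

```python
# Starting rate quoted in this article: $0.015 per 1,000 input characters.
# Assumption: this rate applies to the model you use; other variants may cost more.
PRICE_PER_1K_CHARS_USD = 0.015


def estimate_cost_usd(text: str, price_per_1k: float = PRICE_PER_1K_CHARS_USD) -> float:
    """Estimate the charge for synthesizing `text`, billed per character."""
    return len(text) / 1000 * price_per_1k


if __name__ == "__main__":
    sample = "Hello, world! " * 100  # 1,400 characters
    print(f"{len(sample)} characters -> about ${estimate_cost_usd(sample):.4f}")
```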

How can I use it?

Anyone with an OpenAI API account can access the new audio/speech endpoint.
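
As a minimal sketch of a request to the audio/speech endpoint, the code below builds the HTTP call with Python's standard library. The full URL, the JSON field names (model, input, voice) and the voice name "alloy" are not stated in this article and are taken here as assumptions based on the public API reference; the helper names are hypothetical.

```python
import json
import urllib.request

# Assumed full URL of the audio/speech endpoint mentioned above.
API_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, api_key, model="tts-1", voice="alloy"):
    """Build (but do not send) a POST request for spoken audio.

    The field names and the "alloy" voice are assumptions; verify them
    against the API reference before relying on this.
    """
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(text, out_path, api_key):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, api_key)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())
```

With a valid key, a call such as synthesize_to_file("Hello there.", "hello.mp3", api_key) would write the audio to disk. Keep the key in an environment variable rather than in source code.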

What rate limits can I expect?

Rate limits begin at 50 requests per minute (RPM) for paid accounts. You can see your current limits in your developer console.
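
One simple way to stay under a requests-per-minute limit is to space calls evenly on the client side. The sketch below is a hypothetical helper, not part of the API: it assumes the 50 RPM starting limit, which works out to at least 1.2 seconds between requests, and it does not replace handling of rate-limit errors returned by the server.

```python
import time


class MinIntervalThrottle:
    """Block so that successive calls are at least 60 / RPM seconds apart."""

    def __init__(self, requests_per_minute=50, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / requests_per_minute
        self._clock = clock  # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._next_allowed = None

    def wait(self):
        """Call immediately before each request."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval
```

Create one throttle per API key and call wait() before every request. Adjust requests_per_minute to the limit shown in your developer console.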

What's the maximum input size I can submit per request?

The maximum is 4,096 characters per request, which is roughly 5 minutes of audio at the default speed.
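
Longer text has to be split across several requests. The following is a minimal sketch of one way to do that, assuming it is acceptable to break on spaces and to fall back to a hard cut when a single run of characters exceeds the limit; the function name is hypothetical.

```python
MAX_INPUT_CHARS = 4096  # per-request limit stated above


def split_for_tts(text, limit=MAX_INPUT_CHARS):
    """Split text into pieces of at most `limit` characters, breaking on spaces."""
    chunks = []
    rest = text.strip()
    while len(rest) > limit:
        # Last space that still leaves the piece within the limit.
        cut = rest.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no space available: hard split
        chunks.append(rest[:cut].rstrip())
        rest = rest[cut:].lstrip()
    if rest:
        chunks.append(rest)
    return chunks
```

Each returned piece can be sent as its own request, and the resulting audio segments played or concatenated in order. Splitting on sentence boundaries instead of spaces would likely give more natural pauses.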

Is it possible to stream audio?

Yes! By setting stream=True, you can receive the returned audio in chunks instead of waiting for the complete file.
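
How the stream option is passed depends on the client library you use, so the sketch below covers only the consuming side. It assumes the response is exposed as a file-like object with a read(size) method (as urllib.request.urlopen returns, for example) and copies it to a destination chunk by chunk; the function name is hypothetical.

```python
import io


def copy_in_chunks(source, sink, chunk_size=4096):
    """Copy bytes from a file-like source to sink in fixed-size reads.

    Returns the total number of bytes written.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty bytes object means end of stream
            break
        sink.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    # Stand-in for a streamed audio response.
    fake_response = io.BytesIO(b"\x00" * 10_000)
    buffer = io.BytesIO()
    print(copy_in_chunks(fake_response, buffer), "bytes copied")
```

The sink can be an open file, a socket wrapper, or an audio player's input pipe, which lets playback begin before the full file has arrived.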
