What is it?
With the text-to-speech API, developers can generate high-quality spoken audio from text. We're initially offering six preset voices to choose from and two model variants, tts-1 and tts-1-hd. tts-1 is optimized for real-time use cases and tts-1-hd is optimized for quality. Pricing starts at $0.015 per 1,000 input characters (not tokens).
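Because billing is per character rather than per token, a rough cost estimate only needs the length of the input. The sketch below is illustrative: the function name is hypothetical, it uses only the $0.015 starting price quoted above (other model variants may be priced differently), and it assumes Python's len() counts characters the same way the API does.

```python
# Starting price quoted in this FAQ; treat it as an assumption for other variants.
PRICE_PER_1K_CHARS_USD = 0.015


def estimate_cost_usd(text: str, price_per_1k: float = PRICE_PER_1K_CHARS_USD) -> float:
    """Estimate the cost of one request from its character count (not tokens)."""
    return len(text) / 1000 * price_per_1k
```

For example, a 2,000-character input at the starting price comes to roughly $0.03.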
How can I use it?
Anyone with an OpenAI API account can access the new audio/speech endpoint.
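A minimal sketch of calling the endpoint with only the Python standard library might look like the following. The full URL, the JSON field names (model, input, voice), the voice name "alloy", and the OPENAI_API_KEY environment variable are assumptions not stated in this FAQ, and the helper names are hypothetical; check the API reference before relying on them.

```python
import json
import os
import urllib.request

# Assumed full URL for the audio/speech endpoint mentioned above.
API_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, api_key, model="tts-1", voice="alloy"):
    """Build (but do not send) a POST request for the speech endpoint."""
    payload = {"model": model, "input": text, "voice": voice}  # assumed field names
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(text, out_path, api_key=None):
    """Send the request and write the returned audio bytes to out_path."""
    api_key = api_key or os.environ["OPENAI_API_KEY"]  # assumed variable name
    request = build_speech_request(text, api_key)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())
```

Keeping request construction separate from sending makes it possible to inspect the payload without making a billable call.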
What rate limits can I expect?
Rate limits begin at 50 requests per minute (RPM) for paid accounts. You can see your limits in your developer console.
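One simple way to stay under a per-minute limit on the client side is to space requests evenly. The class below is a hypothetical sketch, not part of any OpenAI library: it assumes the 50 RPM starting limit and enforces a gap of at least 60/rpm seconds between calls. The clock and sleep functions are injectable so the behavior can be checked without waiting.

```python
import time


class RequestThrottle:
    """Space out calls so they start at least 60/rpm seconds apart."""

    def __init__(self, rpm=50, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / rpm
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no call has been made yet

    def wait(self):
        """Block until the next request is allowed, then reserve the slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
```

Calling throttle.wait() immediately before each request keeps a single process at or below the configured rate; it does not coordinate across multiple processes or machines.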
What's the maximum input size I can submit per request?
4096 characters (equivalent to ~5 minutes of audio at default speed).
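Longer text has to be split into several requests. The following sketch shows one possible approach, assuming the 4096-character limit above: it breaks at the last space or newline inside each window and falls back to a hard cut when there is none. The function name is hypothetical, and breaking at whitespace is a design choice here, not something the API requires.

```python
MAX_INPUT_CHARS = 4096  # per-request limit stated above


def split_for_tts(text, limit=MAX_INPUT_CHARS):
    """Split text into pieces of at most `limit` characters, preferring whitespace breaks."""
    if limit < 1:
        raise ValueError("limit must be positive")
    chunks = []
    remaining = text
    while len(remaining) > limit:
        # Find the last space or newline within the window; otherwise cut hard.
        cut = max(remaining.rfind(" ", 0, limit + 1), remaining.rfind("\n", 0, limit + 1))
        if cut <= 0:
            cut = limit
        chunks.append(remaining[:cut])
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each returned piece can then be sent as its own request, and the resulting audio segments played or concatenated in order.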
Is it possible to stream audio?
Yes! By setting stream=True, you can receive the returned audio in chunks.
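How stream=True is passed depends on the client library, which this FAQ does not specify. As an illustration of the consuming side only, the sketch below reads any file-like response body (for example, the object returned by urllib.request.urlopen) in fixed-size chunks. The helper name and chunk size are hypothetical, and an in-memory buffer stands in for a real response.

```python
import io


def iter_audio_chunks(body, chunk_size=4096):
    """Yield successive byte chunks from a file-like body until it is exhausted."""
    while True:
        chunk = body.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Demo with an in-memory stand-in for a streamed HTTP response body.
demo_chunks = list(iter_audio_chunks(io.BytesIO(b"x" * 10), chunk_size=4))
```

In a real application each chunk could be written to a file or handed to an audio player as it arrives, rather than waiting for the full response.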