The Audio API supports two speech-to-text endpoints:
transcriptions
translations
To get started with the Audio API, please read our speech-to-text developer documentation.
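As a rough sketch of how the two endpoints are typically called, the snippet below assumes the official openai Python package (v1-style client) and the whisper-1 model. The helper names and the file name in the comments are illustrative only and are not part of the API.

```python
def transcribe(client, path, model="whisper-1"):
    """Return the transcript of the audio at `path` in its spoken language."""
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text


def translate_to_english(client, path, model="whisper-1"):
    """Return an English translation of the audio at `path`."""
    with open(path, "rb") as audio_file:
        result = client.audio.translations.create(model=model, file=audio_file)
    return result.text


# Example usage (assumes `pip install openai` and OPENAI_API_KEY is set):
#   from openai import OpenAI
#   client = OpenAI()
#   print(transcribe(client, "meeting.mp3"))  # hypothetical file name
```

Passing the client in as an argument keeps the helpers easy to test with a stand-in object.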
How much does the Audio API cost to use?
See our pricing page for details.
What languages are supported?
View a list of supported languages here.
How can we handle large audio files?
The maximum file size for the Audio API is 25 MB. If you expect users to upload audio files larger than 25 MB, follow our documentation for handling long audio inputs.
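One way to catch oversized uploads before calling the API is to check the file size locally. This is a minimal sketch: the constant and function names are hypothetical, and reading "25MB" as 25 * 1024 * 1024 bytes is an assumption.

```python
import math
import os

# Assumption: the limit is 25 * 1024 * 1024 bytes.
# Use 25_000_000 instead for a more conservative check.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def exceeds_upload_limit(path, limit_bytes=MAX_UPLOAD_BYTES):
    """Return True if the file at `path` is larger than the upload limit."""
    return os.path.getsize(path) > limit_bytes


def min_chunks_needed(size_bytes, limit_bytes=MAX_UPLOAD_BYTES):
    """Return the smallest number of pieces that could each fit under the limit."""
    return max(1, math.ceil(size_bytes / limit_bytes))
```

The chunk count is only a lower bound. Compressed audio generally needs to be split on time boundaries with an audio-aware tool rather than by cutting raw bytes.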
What streaming methods are available?
There are two ways to stream a transcription, depending on your use case: you can stream the transcription of an already completed audio recording, or you can stream the transcription of an ongoing audio stream and use OpenAI for turn detection.
Note that streaming is not supported with the whisper-1 model.
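For the completed-recording case, a streamed request might look like the sketch below. It assumes the official openai Python package, a model other than whisper-1 (gpt-4o-mini-transcribe is used here as an assumption), and that incremental events are typed "transcript.text.delta" with the new text in a delta attribute. Check the API reference before relying on these names.

```python
def stream_transcript(client, path, model="gpt-4o-mini-transcribe"):
    """Yield text fragments as the transcription of a finished recording arrives."""
    with open(path, "rb") as audio_file:
        stream = client.audio.transcriptions.create(
            model=model,  # assumption: a model that supports streaming
            file=audio_file,
            response_format="text",
            stream=True,
        )
        for event in stream:
            # Assumption: partial results arrive as "transcript.text.delta"
            # events; other event types (such as the final one) are skipped.
            if getattr(event, "type", None) == "transcript.text.delta":
                yield event.delta


# Example usage (hypothetical file name):
#   from openai import OpenAI
#   for fragment in stream_transcript(OpenAI(), "interview.wav"):
#       print(fragment, end="", flush=True)
```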
What file formats are supported?
The supported file formats are included in our API docs.
Can I send links to audio files to the Audio API?
No, you must send a file in one of the supported audio formats.
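If your audio lives at a URL, one workaround is to download it first and then upload the local copy. The helper below is a hypothetical sketch using only the Python standard library. It keeps the original file extension on the assumption that the extension helps identify the audio format.

```python
import os
import shutil
import tempfile
import urllib.request
from urllib.parse import urlparse


def download_audio(url):
    """Download `url` to a temporary file and return the local path."""
    # Keep the original extension (for example ".mp3") on the temp file.
    suffix = os.path.splitext(urlparse(url).path)[1]
    with urllib.request.urlopen(url) as response:
        with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as local_file:
            shutil.copyfileobj(response, local_file)
            return local_file.name
```

The returned path can then be opened in binary mode and passed as the file argument to the Audio API. Delete the temporary file once the request completes.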