Voice conversations (beta) - FAQ

Your guide to voice conversations with ChatGPT, from setting up and using the feature to understanding its capabilities and limitations.

Written by Johanna C.
Updated this week

This feature is in beta.

We plan to incorporate your feedback and refine the most popular use-cases before launching these features to all users. We look forward to learning how you'll use your new and improved assistant!

What are voice conversations?

Try a new way of interacting with ChatGPT: talk, don’t type – and it’ll respond in a natural voice.

Our voice capability is powered by our models, including Whisper, our open-source speech-to-text model, and a new text-to-speech model.

Enable Voice conversations to engage in back-and-forth voice conversations with ChatGPT.

Which plan types can have voice conversations?

All users on Plus and ChatGPT Enterprise plans.

Which apps can have voice conversations?

Voice conversations are available on the ChatGPT mobile apps for both iOS and Android.

How many voice options are available?

Choose from five lifelike output voices for ChatGPT, each with its own distinct tone and character: Breeze, Cove, Juniper, Sky, and Ember.

Do GPTs use one of the five options of voices in ChatGPT?

No. GPTs have their own voice option, named Shimmer, which is distinct from the five output voices available in voice conversations with ChatGPT.

Which models can I use in voice conversations?

GPT-3.5 and GPT-4 are available for use in voice conversations.

Keep in mind that for Plus users, GPT-4 has a cap of 50 messages every 3 hours. For users on the Enterprise plan, there is no message cap.

Is there a volume limit I can set for voice conversations?

No, ChatGPT does not have a volume setting for voice conversations. Adjust the volume on your device instead.

Can I use ChatGPT vision capabilities and voice conversations in the same conversation?

Yes, you can start a voice conversation in a chat that uses vision capabilities, just as you can in a chat using GPT-3.5 or GPT-4.

Why is Chat History & Training required to be turned ON to use voice conversations?

For users on the Plus plan, Chat History & Training (under Data Controls) must be enabled to have voice conversations, so that you can review the transcripts of your conversations. Plus users can still opt out of having their ChatGPT data used to improve our models by submitting this form.

See "How your data is used to improve model performance" to learn how we use content, including transcriptions of your voice chats, to improve our services, and what choices you have.

Note: The requirement to enable Chat History & Training applies only to Plus users. It does not apply to Enterprise users, because customer prompts and data are not used to train models on the Enterprise plan.

Why does the banner include thumbs up / down rating after my voice conversation has ended?

All users having voice conversations (Plus and Enterprise, on both iOS and Android) will see a banner after their voice conversation has ended.

This feedback survey collects information on the experience of the voice call, not about the conversation or its contents.

Only users on Plus will see the options to rate with the thumbs up/down included in that banner.

Enterprise users will also see the banner when a voice conversation ends, but their banner should not include the thumbs up / down rating options.

Do you save my audio when I use voice conversations?

No, during our beta, audio clips from voice conversations are not saved. We send audio clips to our Whisper API to transcribe them, but they are not retained after processing.

You can find the text transcriptions from your voice conversations in your ChatGPT conversation history.

Do you train your models on audio clips from voice conversations?

No, during our beta, we only use the audio clips to prepare a transcription using our Whisper API. The clips are then deleted, which means that we do not use the audio clips to improve our models. We may allow users to share audio data to help improve our models in the future.

Transcriptions are used as inputs to ChatGPT and appear in conversation histories, which may be used to improve our models (depending on user settings).

Are voice conversations hands-free?

Once you enter a voice conversation, it is hands-free until you exit.

Manual controls let you pause, resume, and exit the voice conversation.

Do voice conversations include subtitles?

No, subtitles are not displayed during a voice conversation. After you exit a voice conversation, the transcription is added to your current text-based conversation with ChatGPT.

Enable the ability to have voice conversations

Settings → App → New Features → Voice conversations (toggle on)

Disable the ability to have voice conversations

Settings → App → New Features → Voice conversations (toggle off)

Start a voice conversation

To start a voice conversation, tap the headphones icon. Once the connection is established, ChatGPT will be listening for you to speak.

Pause the voice conversation

Tap the pause icon.

Interrupt the voice conversation

While ChatGPT is talking, you can either tap "Tap to interrupt" or tap the stop icon.

Resume the voice conversation

Tap the resume icon, and start speaking again.

Unmute the voice conversation

Tap to unmute.

Exit the voice conversation

To exit voice mode, tap the X icon. This ends the voice conversation and returns you to the text-based conversation with ChatGPT.

How long can I leave a voice conversation paused for?

No limit.

How many voice conversations can I have going at once?

One at a time. You will stay in your current conversation until you start a new conversation or switch to another existing conversation.

Why am I receiving the response "Sorry, I cannot help with that"?

This happens because of our safety measures. If you believe your prompt is in line with our Usage Policies, please send us feedback using the thumbs up / thumbs down options in the chat.

Why does the voice input detect a different language from the one I’m speaking?

At times, our voice input feature may not correctly detect the language you are speaking. You can specify a preferred language in Settings for more accurate detection.

1. Click the "..." button in the top-right corner, then click "Settings".

2. On the Settings page, scroll down to the Speech section. Click the "Main Language" dropdown to select your language.
