As of December 12, 2024, we have released video, screen share, and image uploads in advanced voice in our latest mobile apps (app versions 1.2024.337 for Android and 1.2024.339 for iOS). These features have been rolled out to all Team and most Plus and Pro users, except for those in the European Union, Switzerland, Iceland, Norway, and Liechtenstein.
General FAQ
What are voice chats?
Voice conversations allow you to have a spoken conversation with ChatGPT, enabling a more conversational and natural interaction. You can ask questions or have discussions through voice input and receive a spoken response from ChatGPT. Voice conversations are available in ChatGPT mobile apps, desktop apps, and on desktop web at ChatGPT.com.
We have two types of voice conversations, standard and advanced.
Advanced voice is available to Plus, Pro, and Team users, and a monthly preview of advanced voice is available to Free users. Advanced voice uses natively multimodal models, such as GPT-4o, which means that it directly “hears” and generates audio, providing for more natural, real-time conversations that pick up on non-verbal cues, such as how fast you’re talking, and can respond with emotion. Usage of advanced voice (audio inputs and outputs) by Plus, Team, Enterprise, and Edu users is limited on a daily basis.
We are currently rolling out video, screen share, and image upload capabilities in advanced voice in ChatGPT iOS and Android mobile apps. Video, screen share, and image upload capabilities will be available to all Team users and most Plus and Pro users, except for those in the European Union, Switzerland, Iceland, Norway, and Liechtenstein. We expect this rollout to be completed over the next week. Usage of video and screen share capabilities is limited for all eligible plans on a daily basis. Usage of image uploads counts towards your plan’s usage limits. Learn more about Vision message limits.
Standard voice is available to all signed-in ChatGPT users. Standard voice uses several models to generate its response, which includes transcribing what you say into text before sending it to our models for a response. While standard voice is not natively multimodal like advanced voice, standard voice conversations also use GPT-4o alongside GPT-4o mini. Each prompt in standard voice counts towards your message limits.
Voice conversations may make mistakes, so please check important information. Access to advanced voice and the associated usage limits are subject to change.
How do I start a voice conversation?
On Mobile
To start a voice conversation, select the Voice icon on the bottom-right of the screen:
When you begin an advanced voice conversation, you will be taken to a screen with a blue orb in the center.
Please note that conversations using standard voice have a black circle in the center.
When you are having a voice conversation, you can mute or unmute your microphone by selecting the microphone icon on the bottom-left of the screen.
If your app version does not support advanced voice, you will see the headphones entry point icon instead of the new mute/unmute buttons:
You can end the conversation by pressing the exit icon on the bottom-right of the screen.
If you are starting a voice chat for the first time, or using advanced voice for the first time, you will also be asked to pick a voice. Please note that the volume of the voice in the selector may be different from the volume during the voice conversation. You can change your voice at any time in settings, and advanced voice users can also change the voice from within voice mode using the customization menu in the top-right corner.
Please note that you will need to provide the ChatGPT app Microphone permission to use this feature.
On Web
Voice conversations are available on desktop web at ChatGPT.com.
To start a voice conversation on chatgpt.com, select the Voice icon on the bottom-right of the screen:
If this is your first time using advanced voice on your browser, you may need to provide your browser permission to access your device's microphone.
When you begin an advanced voice conversation, you will be taken to a screen with a blue orb in the center.
Please note that conversations using standard voice have a black circle in the center.
When you are having a voice conversation, you can mute or unmute your microphone by selecting the microphone icon on the bottom-left of the screen.
You can end the conversation by pressing the exit icon on the bottom-right of the screen.
If you are starting a voice chat for the first time, or using advanced voice for the first time, you will also be asked to pick a voice. Please note that the volume of the voice in the selector may be different from the volume during the voice conversation. You can change your voice at any time in settings, and advanced voice users can also change the voice from within voice mode using the customization menu in the top-right corner.
How do I share my video with ChatGPT while having a voice conversation?
Video is rolling out in advanced voice on iOS and Android mobile apps only.
You can share video from your devices at any time during a voice chat by selecting the camera button at the bottom of the screen.
You can press this button again to stop sharing your video with ChatGPT. Please note that ChatGPT may respond to content from your camera automatically. In addition, please note that after you stop sharing, ChatGPT may still reference content you shared previously in your conversation.
How do I share a photo or my screen with ChatGPT while having a voice conversation?
Screenshare and image uploads are rolling out in advanced voice on iOS and Android mobile apps only.
You can press the three dots button and choose an option from the pop-up menu to share an image or your screen with ChatGPT.
Choosing the option to take a photo will bring up your camera so you can take a photo and upload it to your voice conversation right away. Choosing the option to upload a photo will allow you to choose from the images on your phone to share with ChatGPT in your voice conversation.
Selecting share screen will bring up your phone’s screen share options, allowing you to broadcast your screen to ChatGPT.
Once you’ve started screensharing, you can press the screenshare button again to stop sharing your screen with ChatGPT.
Please note that ChatGPT may respond to content that you’ve shared from your camera or screen automatically. In addition, please note that after you stop sharing, ChatGPT may still reference content you shared previously in your conversation.
How many voice options are available?
Choose from nine lifelike output voices for ChatGPT, each with its own distinct tone and character:
Arbor - Easygoing and versatile
Breeze - Animated and earnest
Cove - Composed and direct
Ember - Confident and optimistic
Juniper - Open and upbeat
Maple - Cheerful and candid
Sol - Savvy and relaxed
Spruce - Calm and affirming
Vale - Bright and inquisitive
Until early 2025, you can also interact with a Santa voice in ChatGPT. Learn more about chats with Santa.
For how long can I have voice chats (audio only)?
For Plus, Team, Enterprise, and Edu users, use of audio in advanced voice is subject to a daily limit, and daily limits may change. We provide a notice as you are approaching the daily limit. Plus, Team, Enterprise, and Edu users will be notified when they have 15 minutes of advanced voice audio left for the day. Free users have access to a monthly preview to try advanced voice. Pro subscribers have unlimited access to advanced voice audio, subject to abuse guardrails. Learn more about our Pro plan and associated limits.
Once the advanced voice audio daily limit is reached, the conversation will immediately end and you will be able to continue your conversation using standard voice.
Standard voice shares message limits with the underlying model used to generate a response. Learn more about message limits in ChatGPT.
For how long can I use video and screenshare in my voice chats?
Per user, usage of video and screen share capabilities is limited for all eligible plans on a daily basis. We provide a notice as you are approaching the daily limit.
Note: Once the advanced voice audio daily usage limit is reached, you will no longer be able to share new video or screen share content until your usage limit resets.
Usage of video and screen share capabilities is also limited on a per-conversation basis. If you reach the conversation limit, you will be able to start a new chat to continue using video and screenshare until you reach your usage limit.
How many image uploads can I use in my voice chats?
Usage of image uploads counts towards your plan’s Vision message limits.
Can I keep a conversation going in the background while I am in other apps or with my phone screen locked?
Yes, you can keep a conversation going in the background in both standard and advanced voice by toggling “Background Conversations” on in settings.
Can I resume a previous conversation I had with voice mode?
Advanced voice conversations can be resumed in advanced voice, text, or standard voice. Text and standard voice conversations cannot yet be resumed in advanced voice.
Conversations with standard voice can be resumed at any time with standard voice or text, but cannot be continued with advanced voice.
Do you have any tips for preventing interruptions with advanced voice conversations?
Occasionally, interruptions may happen during advanced voice conversations. We recommend using headphones for advanced voice conversations.
On iPhone, enabling Voice Isolation mic mode can help to avoid unintentional interruptions. You can enable Voice Isolation by opening Control Center while having an advanced voice conversation, selecting Mic Mode, and switching to Voice Isolation.
If you are still experiencing issues, we recommend closing your app and restarting, turning up the volume of your assistant, or moving to a quieter environment.
Please note that the advanced voice conversation experience is not yet optimized for use with in-car Bluetooth or speakerphone.
Can I have voice conversations with GPTs?
Standard voice conversations are available with GPTs. GPTs have their own voice option named Shimmer that is distinctly different from the nine output voices available to use when having voice conversations with ChatGPT.
Advanced voice conversations are not yet available for use with GPTs. If you attempt to have an advanced voice conversation with a GPT, you will be invited to start a new chat using standard voice.
Can I access my memories and custom instructions in voice conversations?
In both standard and advanced voice modes, you can create and access memories, as well as access custom instructions.
Can I generate musical content with voice conversations?
No. To respect creators’ rights, we’ve put in place several mitigations, including new filters, to prevent voice conversations from responding with musical content, including singing.
How do I change voices during a chat with Advanced voice mode?
You can change your voice in settings or, if you have access to advanced voice, from the customization menu in the top right corner of voice mode.
Voices are set per conversation. If you change your voice within voice mode, you will be prompted to start a new chat.
Can I browse resources from the internet with voice mode?
As of 12/16/2024, both standard and advanced voice can access resources from the internet to supplement their responses.
I’m located in a country that has advanced voice, but I don’t have video, images, or screen share controls in my ChatGPT mobile app yet. What should I do?
As of December 12, 2024, we have released video, screen share, and image upload capabilities in advanced voice to all Team and most Plus and Pro users, except for those in the European Union, Switzerland, Iceland, Norway, and Liechtenstein. Make sure you are on the latest version of the ChatGPT mobile app to use video, screen share, and image uploads as soon as they roll out.
Why do the voice transcripts sometimes not match the conversation I had?
Advanced voice conversations are inherently multimodal, allowing for audio exchange between you and the model. As a result, when this audio is transcribed, the transcription might not always align perfectly with the original conversation.
Is there a volume limit I can set for voice conversations?
Volume will be set on the device itself. There is not a volume limit for voice conversations as a setting in ChatGPT.
How can I leave feedback on my voice conversation?
All users having voice conversations will see a banner after their voice conversation has ended that prompts the user to provide feedback on their voice chat.
Only users on Free, Pro, Plus, and Team accounts will see the options to rate with the thumbs up/down included in that banner.
While Enterprise and Edu users will see this banner after ending a voice conversation, their banner will not include the rating options thumbs up or down.
Do voice conversations include subtitles?
No, subtitles are not included or displayed during a voice conversation. After you exit a voice conversation, the transcription is added to your current text-based conversation with ChatGPT. You can refer back to the transcription of your conversation in your chat history on the left-hand side of the ChatGPT app on web and desktop and in the menu on the left-hand side of the ChatGPT mobile app.
How many voice conversations can I have going at once?
You can only have one voice chat at a time.
Why am I receiving the response "Sorry, my guidelines won't let me talk about that" during a voice conversation?
This happens due to our safety measures. If you believe your prompt is in line with our Usage Policies, please send us that feedback through the thumbs up/thumbs down options after the chat.
Why does voice mode or dictation detect a different language from the one I’m speaking?
At times, the language you speak might not be accurately reflected in our voice input feature. You can verbally correct the model to speak your language of choice. For standard voice or dictation, you can also specify a preferred language in the app Settings for a more accurate detection.
Open the sidebar by selecting the two lines on the top-left of the screen, and select your name at the bottom to open the Settings.
In the Settings page, scroll down to the Speech section. Click on the "Main Language" dropdown to select your language.
Privacy & Controls
How long do you retain audio and video clips from my voice chats?
With advanced voice conversations, audio and video clips from your voice chats are stored alongside the transcription that appears in your chat history. We provide a visual indicator in the chat history that shows which chats happen with advanced voice mode: just look for the grayed out text and the small microphone or camera.
Audio and video clips from your advanced voice chats will be retained for as long as the chat is part of your chat history. When you delete the chat, we’ll also delete the associated audio and video clip within 30 days unless we need to keep it for security or legal reasons or if you previously shared your audio or video clips with us to train our models and the audio or video clip was previously disassociated from your account.
You cannot recover chats once you delete them. If you want to remove a chat from being visible in your chat history but retain it in your account, you should use the archive function. Audio and video clips associated with archived chats continue to be retained.
Please refer to this article to understand how content may be used to train our models and the choices that you have.
With standard voice mode, audio clips from ChatGPT are transcribed before we generate a response. We delete audio clips once transcription is complete, unless you’ve chosen to share your audio clips to train our models. (Note: Audio clips are deleted even if the transcription itself fails.) Learn more about sharing your audio to train our models.
Do you train your models on audio or video clips from voice chats?
Nope, unless you choose to share audio or video clips from voice chats for us to train our models.
If you have “Improve the model for everyone” enabled, then we may use transcripts and other files (such as images uploaded to the conversation) from your voice chats to train our models, depending on your choices and plan. But we won’t use the associated audio or video clips to train our models unless you have shared them with us for model training. Learn more about your choices.
Sharing audio or video to train our models
By default, we won’t train our models with audio or video clips, including screensharing clips, from voice chats. But Free, Plus, and Pro users may choose to share audio and video clips from their voice chats to help us train our models by enabling “Improve the model for everyone” in Data Controls and toggling on “Include your audio recordings” and “Include your video recordings.”
You can also respond affirmatively when we invite you to share audio and video clips for training.
If you have “Improve the model for everyone” enabled, we may use transcripts and other files (such as images uploaded into the conversation) from your voice chats to train our models, even if you are not sharing audio or video clips from your voice chats.
Who can share audio and video to train models?
ChatGPT users on Free, Plus and Pro plans can share audio and/or video from personal workspaces. Users cannot share audio or video from voice chats in ChatGPT Team, Edu, and Enterprise workspaces.
What happens if I share my audio and video to train models?
If you choose to share your audio and video from voice chats, then going forward, we will use audio and video from your conversations to train our models. Audio from your standard voice chats will also be used. We will take steps to reduce the amount of personal information in audio and video from voice chats that is used to train our models. Learn more about how we use your content to train our models. It may be necessary for our team to review the audio or video clips that you’ve shared with us in order to use them for training. For example, we might have a human on our team listen to an audio recording associated with a “thumbs down” feedback signal to identify where ChatGPT might have misinterpreted what was said.
How can I stop sharing audio and video?
You can stop sharing through the Data Controls page in your ChatGPT settings. Just disable the “Include your audio recordings” or “Include your video recordings” toggles, or turn off “Improve the model for everyone” entirely.
What happens if I decide to stop sharing my audio or video?
If you choose to stop sharing, then audio or video from new voice chats will no longer be used to train our models. Audio and video that was previously disassociated from your account may continue to be used to train our models. Prior to using audio or video clips from voice chats for training, we take steps to reduce the amount of personal information in the clip.
If you stop sharing your audio or video from your voice chats, we may still use transcriptions and other files, like image uploads, from those chats to train our models if you have “Improve the model for everyone” enabled. To opt out of training our models entirely, please disable “Improve the model for everyone.”
Is my choice to share audio or video for model training a device-specific setting?
Your choice to share audio or video from voice chats for model training is tied to your account. If you choose to share, then that choice will also apply to other devices where you are logged in. You can stop sharing audio or video through your Data Controls settings in ChatGPT.