OpenAI has announced the alpha rollout of its new Advanced Voice Mode for a select group of ChatGPT Plus users, allowing them to speak more naturalistically with the AI chatbot on the official ChatGPT mobile app for iOS and Android.
On X, the company posted that the mode would be available to “a small group of ChatGPT Plus users,” adding in a follow-up post that “We’ll continue to add more people on a rolling basis and plan for everyone on [ChatGPT] Plus to have access in the fall.”
ChatGPT Plus is, of course, the $20-per-month individual subscription service OpenAI offers for access to its signature large language model (LLM)-powered chatbot, alongside its other tiers: Free, Team, and Enterprise.
It was unclear how OpenAI was selecting the initial batch of users to receive access to Advanced Voice Mode, but it posted that “users in this alpha will receive an email with instructions and a message in their mobile app” for ChatGPT, so those interested would be advised to check there.
The feature, which was shown off at OpenAI’s Spring Update event back in May 2024 (what feels like an eternity in the fast-moving AI news and hype cycle), allows users to engage in real-time conversation with four AI-generated voices on ChatGPT. The chatbot attempts to converse back naturalistically, handling interruptions and even detecting, responding to, and conveying different emotions in its utterances and intonations.
OpenAI showed off a number of potential use cases for this more naturalistic and conversational Advanced Voice Mode, including, when combined with its vision capabilities for seeing and responding to live video, acting as a tutoring aid, fashion adviser, and guide for the visually impaired.
Delayed but finally ready
However, the rollout of the feature was delayed from OpenAI’s initial estimate of late June following a controversy involving Hollywood actor Scarlett Johansson (Marvel’s Black Widow and the voice of the titular AI in Her), who accused OpenAI of seeking to use her voice and then mimicking it even after she refused.
OpenAI denied that any similarity between its AI voice “Sky” and Johansson’s performance in Her was intentional, but it pulled the voice from its library, and it remains offline to this day.
On X today, the official ChatGPT App account acknowledged the delay, writing “the long awaited Advanced Voice Mode [is] now beginning to roll out!”
Mira Murati, OpenAI’s Chief Technology Officer, shared her enthusiasm about the new feature in a post on X: “Richer and more natural real-time conversations make the technology less rigid — we’ve found it more collaborative and helpful and think you will as well.”
Following a number of new safety commitments and papers
OpenAI’s official announcement highlighted its ongoing efforts to ensure quality and safety.
“Since we first demoed advanced Voice Mode, we’ve been working to reinforce the safety and quality of voice conversations as we prepare to bring this frontier technology to millions of people,” the company stated on X, adding: “We tested GPT-4o’s voice capabilities with 100+ external red teamers across 45 languages. To protect people’s privacy, we’ve trained the model to only speak in the four preset voices, and we built systems to block outputs that differ from those voices. We’ve also implemented guardrails to block requests for violent or copyrighted content.”
The news comes as the potential for AI to be used as a tool for fraud or impersonation is under renewed scrutiny.
Though OpenAI’s Voice Mode doesn’t currently allow for new AI-generated voices or voice cloning, the mode could presumably still be used to trick people who aren’t aware they are speaking with an AI.
Separately, former OpenAI backer and co-founder turned rival Elon Musk was this week criticized for sharing a voice clone of U.S. Democratic presidential candidate Kamala Harris in a video attacking her.
In the months following its Spring Update, OpenAI has released a number of new papers on safety and AI model alignment (techniques for keeping models compliant with human rules and objectives). The releases also follow the disbanding of its superalignment team and criticism from some former and current employees that the company has deprioritized safety in favor of shipping new products.
The slow rollout of Advanced Voice Mode seems designed to counter those criticisms and to assure users, and possibly regulators or lawmakers, that OpenAI is taking safety seriously and prioritizing it equal to or above profits.
The release of ChatGPT Advanced Voice Mode also further differentiates OpenAI from rivals such as Meta, with its new Llama model, and Anthropic’s Claude, and it puts pressure on Hume, an AI startup focused on emotive voice.