This article was contributed by Taesu Kim, CEO of Neosapience
With the tremendous advances in how AI/ML technologies are being deployed, one of the most exciting, controversial, and rapidly evolving areas relates to the human voice. One particular example jumps out as encapsulating the tangle of issues and emotions tied to AI-powered voices.
Last summer, AI technology was used to give voice to some of the late Anthony Bourdain’s writings, words that he never spoke or read aloud but that were nevertheless his; voice cloning technology brought the text to life in Roadrunner: A Film About Anthony Bourdain. Some in the audience felt duped that it wasn’t really Bourdain; others thought the move was a misstep, as Bourdain was not alive to give permission to manipulate his voice in such a way; many more felt it was simply a creative storytelling device.
The Bourdain example highlights two key issues that will rise to the forefront of how AI-based voice technologies will be used in the future. On one hand, there are questions about who has ownership of a voice, and therefore control over how it might be used now and in the future. On the other is the ethical issue: is it morally right to allow someone’s voice to be used in the public domain after his or her death when he or she has no control over how it will be used or what is said?
These questions are surfacing because AI-based voice technology is beginning to really take off; a tremendous amount of time and money has been spent on research and development to make machine-generated voices sound “real.” They are now capable of conveying the emotion, texture, cadence, natural rise and fall, and many other distinct markers that characterize human speech (not to mention song). This is game-changing, because it has become hard for listeners to tell the difference between the speech of a human and that of a machine.
As such, we’ve reached a defining moment in the technology’s development where we need to figure out basic guidelines and set guardrails, or like so many technologies before it, voice technology’s applications will be used in ways they were never intended.
Ownership of digital identities
We’ve become a global society thirsting for rich content experiences, whether through film, television, and streaming services or user-generated media like YouTube and TikTok. And soon the metaverse will offer even more new ways to engage with content. All of these avenues present enormous opportunities for AI-powered voice, as well as video. AI-powered voice and video make it exponentially faster, easier, and less expensive to create content, not to mention adapt it for other languages. These technologies are also highly accessible through text-to-voice services, so essentially anyone can leverage AI for content creation without requiring a studio and a lot of fancy equipment, spurring high demand in the entertainment industry.
At the same time, there is a lot of fear surrounding the ownership and monetization of one’s virtual identity. In a world of deepfakes, misrepresentation, and identity theft, it is fair for individuals to wonder what happens if someone co-opts their digital identity for their own purposes. Not only would the individual lose control of how his or her likeness is deployed, as well as any revenue or brand recognition associated with it, but it could be used in inappropriate, even illegal ways, or so the thinking goes.
This is highly unlikely, however. Each human voice, as well as each face, has its own unique footprint, composed of tens of thousands to millions of characteristics. With advanced fraud detection and management technologies in development, AI-powered identities can be safeguarded relatively easily. What is far more complicated is managing that digital identity over time. It becomes not just a business matter, but a series of ethical decisions that are inextricably intertwined.
The ethics of virtual representation and AI-powered identities
Was it okay for the director to use Bourdain’s digitized voice in his movie? The director allegedly obtained permission to use his AI-cloned voice to deliver the lines in question, but from whom? Who ultimately holds the right to decide?
Similarly, the AI-powered voice of famous South Korean folk rock singer Kim Kwang-Seok was recently used to release a new song. The artist has been dead for 25 years, but a broadcasting company brokered a deal with his family to use AI to clone his voice and deploy it for something entirely new, largely to the delight of the public. There are many other cases of entertainment companies and content creators seeking to bring the voices and likenesses of famous people back for concerts or movies. But is it ethically responsible?
On the surface, it is something that can be addressed simply enough through licensing deals and contracts with the entertainer’s estate or, ideally, determined while the artist is still alive. As the practice becomes more common, we should be prepared to see a sort of name, image, voice, and likeness clause within a person’s will, particularly one that governs their posthumous wishes or appoints a manager to oversee the career of their virtual self, much the same way they have a business manager in life.
Virtual identities are not just for celebs
It is one thing for celebrities to consider such content and management deals, but what about regular, everyday people? Perhaps those who grieve for loved ones, like this woman who lost her young daughter due to an illness? Meeting in a virtual reality environment, the woman was able to connect with her daughter in avatar form, seemingly traveling to a version of heaven and holding a birthday party. The experience is clearly quite meaningful to the young mother and her family, but the interaction is in no way real. Some companies — as well as consumers — want no part in developing such experiences because it takes liberties with the child’s likeness and personality, while others see a chance to provide comfort and closure to families in pain.
And what about creating new virtual experiences for education purposes, such as the award-winning Interactive Holograms: Survivor Stories Experience? At a time when students and citizens question whether the Holocaust was real or what Nazism actually entailed, is there not room to use such technology for good? What lines are appropriate in terms of creative license?
Moving into an AI-powered future with AI-powered identities
There are no easy answers when it comes to a virtual, or AI-powered, identity. We sit at the precipice of an entirely new means of content creation, where famous as well as ordinary people will soon be asked to think about how their voice and image could be used not just today but long after they are gone.
Virtual identity will become a currency that should be regarded much like physical assets: something for which people can specify their wishes in life and death, and appoint managers and executors to approve its usage moving forward. This may sound far-fetched, but digital voices don’t age, nor do avatars. With the metaverse going mainstream, our virtual selves can live well beyond our years.
It will become a new imperative for everyone to determine and clearly define the parameters they are comfortable with in terms of their digital identity. Similarly, the companies that offer platforms for AI-powered voice and video creation need to develop clear policies for the adoption and use of a specific AI-powered virtual identity. Doing so protects both individuals and companies from tumbling down a slippery slope as highly disruptive AI-powered virtual identities become normalized.
Taesu Kim is the CEO of Neosapience