He left you no time to question further—do it NOW! Was it really your boss or a voice clone?

The potential uses of voice cloning, a deep-learning technology, are immense. Imagine taking a sample recording of the way you speak and having a computer generate any spoken words in a voice that is a digital twin of your own.

One report on the global voice cloning market by Data Bridge projected that the industry would reach an estimated value of US$4,446 million between 2021 and 2028, growing at a compound annual growth rate (CAGR) of 25.74%. The forecast is attributed to a rising preference for machine-generated voices in interactive training and learning, personal digital voice assistants and numerous other uses.
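To put those figures in perspective, the arithmetic below back-calculates the market size the projection implies for 2021. It assumes the US$4,446 million figure refers to 2028 and that the 25.74% CAGR compounds over the full seven-year horizon; both are reading assumptions for this sketch, not statements taken from the report.

```python
# Implied 2021 baseline from the projection above (assumed reading: US$4,446m
# by 2028, compounding at a 25.74% CAGR over the seven years from 2021).
projected_2028_millions = 4_446.0
cagr = 0.2574
years = 2028 - 2021

implied_2021_millions = projected_2028_millions / (1 + cagr) ** years
print(f"Implied 2021 market size: US${implied_2021_millions:,.0f}m")  # roughly US$895m
```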

For example, voice clones are a boon for people with motor neuron disease who are unable to speak or move their hands well. With the technology, they can be connected to an assistive system that tracks their eye movements, letting them select pre-recorded words and sentences and communicate in a natural-sounding voice.
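To make that flow concrete, here is a minimal Python sketch of such a setup. The `EyeTracker` and `VoiceCloneEngine` classes are hypothetical stand-ins for real gaze-tracking hardware and a real voice-cloning engine, not any specific product.

```python
# Hypothetical stand-ins: a real system would map gaze position to on-screen
# tiles and synthesize audio from a model of the user's own voice.

PHRASES = ["Yes", "No", "Thank you", "I need help", "Please come here"]

class EyeTracker:
    """Placeholder gaze tracker: returns the index of the phrase the user dwelled on."""
    def selected_index(self) -> int:
        return 3  # pretend the user's gaze settled on "I need help"

class VoiceCloneEngine:
    """Placeholder text-to-speech engine built from the user's own voice samples."""
    def speak(self, text: str) -> None:
        print(f"[cloned voice] {text}")

tracker, voice = EyeTracker(), VoiceCloneEngine()
voice.speak(PHRASES[tracker.selected_index()])
```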

However, just like any technology that can be used for good, voice cloning can be abused if safeguards are not enforced.

“It wasn’t me”

Voice cloning enables scammers to go beyond more traditional approaches like phishing. In 2019, a CEO in the United Kingdom fell prey to an AI-generated deepfake that he recognized as his boss's voice. The fake voice convinced him to send US$250,000 to the scammer.

Over and above being used as a tool for fraud, voice cloning can also serve as an alibi for a person under suspicion. In 2019, a Philippine Senate inquiry played a recorded phone conversation involving a Bureau of Corrections (BuCor) official accused of graft. The official insisted the voice in the recording was not hers, even as the Senators, convinced that it was, could barely stop laughing.

One Senator finally warned the accused of perjury and stated that the Philippines’ Department of Information and Communications Technology had the means to determine if a voice was fake or not.

Earlier this year, Zimbabwe Vice President Kembo Mohadi, rather than face the continued disgrace of being accused of making sexual advances toward women, resigned from his post. The allegations stemmed from audio recordings featuring a voice similar to Mohadi's making lurid overtures to a female subordinate. He then claimed to be the victim of hacking and voice cloning.

Safeguards in development

Of course, government and industry regulation has been the de facto way to deter abuse of voice cloning. China already has rules on online video and audio content: any use of AI or virtual reality must be clearly and prominently marked.

However, people who acknowledge the obvious dangers of digitized voice abuse also need to walk their talk. One person who has done so is John Costello, director of the augmentative communication program at Boston Children’s Hospital. He has suggested authentication in the form of an audio fingerprint layered over generated voices.
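As a rough illustration of the idea, the sketch below layers a faint marker tone over generated audio and later checks for it. A real audio fingerprint would be far more robust (spread-spectrum watermarking, perceptual masking, and so on); the 19 kHz tone, amplitude and detection threshold here are illustrative assumptions, not Costello's actual proposal.

```python
import numpy as np

SAMPLE_RATE = 44_100
MARK_HZ = 19_000        # near the upper edge of human hearing
MARK_AMPLITUDE = 0.005  # faint relative to typical speech levels

def embed_fingerprint(audio: np.ndarray) -> np.ndarray:
    """Layer a quiet marker tone over generated speech."""
    t = np.arange(len(audio)) / SAMPLE_RATE
    return audio + MARK_AMPLITUDE * np.sin(2 * np.pi * MARK_HZ * t)

def has_fingerprint(audio: np.ndarray, threshold: float = 1.5e-3) -> bool:
    """Check for unusual energy in the marker band (threshold tuned to this toy example)."""
    spectrum = np.abs(np.fft.rfft(audio)) / len(audio)
    freqs = np.fft.rfftfreq(len(audio), d=1 / SAMPLE_RATE)
    band = (freqs > MARK_HZ - 50) & (freqs < MARK_HZ + 50)
    return bool(spectrum[band].max() > threshold)

speech = np.random.uniform(-0.1, 0.1, SAMPLE_RATE)   # one second of stand-in "speech"
print(has_fingerprint(speech))                        # False: no fingerprint present
print(has_fingerprint(embed_fingerprint(speech)))     # True: fingerprint detected
```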

Then there is a New York-based company that has created algorithms that take into account a voice's different features, such as tone, frequency, prosody and phrasing, to determine whether it originates from a human being or a machine, such as a loudspeaker.
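In the same spirit, a detector can summarize each recording as a vector of acoustic features and train a classifier on labelled human and machine-generated clips. The sketch below uses MFCCs, spectral flatness, spectral centroid and a random forest purely as illustrative choices; it is not the company's actual algorithm, and the random arrays stand in for real labelled recordings.

```python
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

SR = 16_000  # assumed sample rate for the stand-in clips

def voice_features(audio: np.ndarray) -> np.ndarray:
    """Summarize tone-, frequency- and prosody-style cues as a fixed-length vector."""
    mfcc = librosa.feature.mfcc(y=audio, sr=SR, n_mfcc=13)        # timbre / "tone"
    flatness = librosa.feature.spectral_flatness(y=audio)         # how noise-like the spectrum is
    centroid = librosa.feature.spectral_centroid(y=audio, sr=SR)  # spectral "brightness"
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1),
                           [flatness.mean()], [centroid.mean()]])

# Stand-in data: random one-second signals in place of real labelled clips.
rng = np.random.default_rng(0)
human_clips = [rng.normal(0, 0.1, SR) for _ in range(20)]
synthetic_clips = [rng.uniform(-0.1, 0.1, SR) for _ in range(20)]

X = np.array([voice_features(c) for c in human_clips + synthetic_clips])
y = np.array([0] * len(human_clips) + [1] * len(synthetic_clips))  # 1 = machine-generated

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print("Predicted label for a new clip:",
      clf.predict([voice_features(rng.normal(0, 0.1, SR))])[0])
```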

Finally, if AI can clone a voice within five seconds, the same technology can also be used to quickly analyze collated voice samples and predict whether they are machine-generated.

So there you have it. Bane or boon, voice cloning is now in our midst. Sounds may not be what they seem. The next time you hear something that sounds like someone you know, be aware it could be a deepfaked voice used to trick you.