Over the weekend, the Twitter account for a London police department shared that officers had arrested suspects for possession of “a large amount of cannabis,” and included a picture of what appeared to be some truly foul weed. The replies on the post quickly filled up with jokes about the cops confiscating hedge trimmings, loose-leaf chamomile tea, and so on, but one tweet included a fake audio clip of President Joe Biden explaining that he actually prefers low-quality pot.
“I’m from Scranton,” the simulated Biden said. “What I’m smoking is dirt. So let’s get that straight, Jack. Pure brick. Ass. Okay?”
Of course, the real Biden has said no such thing (at least not publicly). But lately, virtual Biden has been spouting off — rather fluently — on a number of topics about which the president wouldn’t and/or shouldn’t have strong opinions. It’s all thanks to the advent of powerful AI tools for voice cloning, and one in particular that launched last month: Voice Lab, a platform created by the AI startup ElevenLabs.
For just $5, you can access ElevenLabs’ software and start creating your own audio. All it needs to realistically mimic an actual person — and then read any text in their voice — is a short sample of that person’s speech, which it uses to create a “clone.” Recordings of Biden’s voice are abundant, so it’s easy to build your AI copy of him. Then it’s a matter of typing out whatever you want him to say. Perhaps you’ve imagined a gamer Biden who rants about “lame-ass bitch boys” playing as “no-skill heroes” on Overwatch? Now you can bring that character, “Joe Rizzler,” to life.
While clips like these are clearly artificial and seemingly innocuous, the potential for abuse of the technology is cause for concern. ElevenLabs itself, shortly after Voice Lab entered its beta phase, tweeted that they’d seen “an increasing number of voice cloning misuse cases” and would take steps toward “implementing additional safeguards.” The company stopped short of describing specific misuse, but around the same time, 4chan trolls were using the tool to generate fake audio of various public figures spewing hate speech. One snippet sounded like actor Emma Watson reading from Hitler’s Mein Kampf. And earlier this month, in a similar deepfake combined with doctored video footage, Biden appeared to give a speech denouncing transgender people. Needless to say, that never happened.
“While we can trace back any generated audio back to the user, we’d like to address this by implementing additional safeguards.” (ElevenLabs, @elevenlabsio, January 30, 2023)
ElevenLabs did not respond to a request for comment regarding what new safety measures they’ve added or plan to apply to Voice Lab. Nor did they address whether they view the non-offensive AI Biden speeches — like one shared by Twitter user @MNateShyamalan, in which he gets philosophical about the sitcom Young Sheldon — as examples of “misuse.” The eventual impacts of voice cloning remain to be seen; improvements to the technology will no doubt complicate efforts to separate genuine recordings from bogus material designed to push an agenda or false narrative. Right now, though, shitposters are reveling in the apolitical words they can put in the president’s mouth.
“got too high last night and made this for some reason. idk” pic.twitter.com/HZZDeIazz7 (soul nate, @MNateShyamalan, February 18, 2023)
Why is it so amusing, or enticing, or even comforting to hear a Biden bot spitting references to drugs and video games? Well, for starters, it’s not exactly presidential. Then there’s the fact that Biden himself is 80 years old, so youthful vernacular is outright absurd in his grandfatherly cadence. For Zach Silberberg, a digital content producer whose AI Biden speeches have made a splash on Twitter (one has him discussing the Matt Damon film We Bought a Zoo, and in another he complains of being “inverted” like the characters in Tenet), the element of anarchy is essential.
“This version of the president is unpredictable,” Silberberg says. “We have no idea what the hell he’s going to say. It could be silly, or vulgar, or rambling, or anything at all. He’s not slow and boring and more of the same neoconservative version of a Democrat we’ve been conditioned to accept as the most liberal version of a president we can get.”
And ultimately, notes Silberberg — who has also made comedic fake dialogues between Joe Rogan and Ben Shapiro using ElevenLabs’ software — the Biden riffs are about getting one over on the most powerful man in the world. “It’s a fantasy for a president who we can laugh at, and also torture. Who doesn’t want to see Joe Biden get stuck in the house from a horror movie? It’s just plain funny.” Silberberg clarified that he is “anti-AI” and that “AI is bad.” In that sense, his videos can be construed as warnings.
“joe biden has really messed up big time” pic.twitter.com/pAGHZwDcyi (zach silberberg, @zachsilberberg, February 19, 2023)
So as ElevenLabs touts their technology’s more mundane achievements — like an ability to craft a totally synthetic voice that will read The Great Gatsby aloud — a significant part of their user base to date is more interested in mischief, ranging from hysterical and harmless to strategic and malicious. Short of manually approving or denying each voice cloning request (an option they’ve considered), there’s probably not much the startup can do to crack down on voice clones of our presidents past and future. That could be disastrous for democracy as we know it, but on the other hand, it means we get to hear Trump and Biden bickering like teenage boys over a multiplayer first-person shooter. Which is maybe more accurate to our political moment than we’d like to admit.