Voicemod tools up with $14.5M to ride the generative AI (sonic)boom

Natasha Lomas

Updated February 24, 2023 at 9:12 a.m.·14 min read

The first thing we ask Voicemod's CEO and co-founder, Jaime Bosch, when he picks up the phone to talk about a new funding round is not something we're accustomed to asking -- but our question may become the norm in the generative AI future that's fast-flying at us: Is this your real voice?

Bosch's startup has been fiddling with audio effects for almost a decade, playing in the field of digital signal processing (DSP) -- where its early focus was on creating fun 'sound emoji' effects and reactions for gamers to spice up their voice chats. And gamers do remain its main user-base (for now). But the audio field is being charged by developments in AI -- which Voicemod's team is hoping will lead to whole new use-cases and many more users for its tools.

So where DSP technology was about applying effects to a person's (real) voice, developments in artificial intelligence are enabling startups like Voicemod to offer tools to create entirely synthesized (unreal) voices. And even the ability for users to 'wear' these voices in real-time -- so they can speak with a voice that isn't theirs. Think of it as the audio equivalent of a Snapchat lens or TikTok's viral teenage filter or Reface's celebrity face-swaps.

AI voice can even enable voice-shifting into another person's (real) voice. And not just for talking about the weather or shooting the shit. But for what's known as sing-to-sing voice conversion. Meaning you could get to sing in someone else's voice -- supercharging your karaoke game, say, by singing Bohemian Rhapsody as literally the voice of Freddie Mercury. And even switching between Mercury, May and Taylor, for the full mock opera effect if you have enough trained AI models (and microphones) on hand. Mamma-mia!

Artificial intelligence makes all this possible -- even if legal and ethical questions may create pause for thought about rushing to unleash real-time voice-shifting upon a world that still relies plenty upon fixed identities. (Banks pushing customers to record 'a unique voiceprint' to use as a password definitely need to sit up and start listening.)

Voicemod acquired another audio effects startup last year, called Voctro Labs, whose technology Bosch says it's working to blend with its own to create an amped up hybrid platform. The combo has already allowed it to expand what it offers -- launching a text-to-song feature last December which lets you turn your own lyrics into a vocal composition using generative AI. He tells us more is on the way -- including the aforementioned sing-to-sing feature.

Voctro's tech may be familiar as it was involved in the development of a voice clone of musician Holly Herndon which appeared in a viral TED Talk last year -- in which her AI voice could be heard duetting with the real voice of another musician (Pher) in real-time. Which, well, if you haven't already seen it, is quite the visual-audio spectacle, as well as being a mouthful to explain. It's also a taster of what Voicemod has coming to a keyboard near you.

"We're definitely going to launch more products and more ways for people to express themselves with the generative AI technology," Bosch tells us. "Not all Voctro Labs' technologies are related to music -- but they have a lot of technology related to singing, from this text-to-song technology to sing-to-sing technology in real time. So we have a lot of new projects and new products upcoming.

"We are going to strengthen our speech-to-speech AI real-time technology, because we are basically merging our technology with their technology. We're basically creating a hybrid technology that will be better than ours -- or there's a mix of both... [So their sing-to-sing technology will be] combined with our DSP technology -- that we could use to do autotune. So we could potentially help artists with their voice and on the tone. And so this is, this is gonna be really, really interesting."

As well as providing direct-to-consumer/creator audio tools, it offers its technologies via SDK and APIs for third parties to integrate into their own products, from games and apps to hardware. So it's set up to distribute its tech across the gamer-creator ecosystem and have demand come find it.

Generative AI-powered disruption in audio of course mirrors (in a non-exact fairground 'crazy mirror' kind of a way) developments we're seeing happen elsewhere: Visually, to graphics and illustration, as a result of deep learning and the advent of prompt-based image generation interfaces (such as DALL-E and Stable Diffusion). Also to the written word, through the large language models that underpin generative AI chatbots like ChatGPT that can produce song lyrics or a whole essay on demand. And, indeed, in the case of musical composition -- where Google recently showed off a prompt-based generative AI song composer which can apparently produce arrangements that match the musical vibe you describe (although it said it's not releasing that particular generative AI model -- but surely someone else will).

It's clear that AI is bending the rules of what it's possible for a single person to create. And, well, as with freedom, the open concept, this is both thrilling and terrifying. Because, it's what you do with it that counts.

The coming years are going to be all about finding out what people do with such powerful AI tools at their fingertips.


Voicemod is positioning itself to ride this wave by building a toolbox for creators to survive and thrive in a reality-bending future and across a range of use-cases -- hence it's talking in terms of sonic identity and voice avatars for the social metaverse (at the future-gaze-y end) but also just helping you sound your sparkling best on a work Zoom call. So a sort of audio make-up as it were. Apply as needed.

"Now suddenly everyone can become a creator," predicts Bosch of the generative AI boon. "Everyone can come, basically, with no skill set. Or with no learnings on how to really craft those audios. They will be able to actually create those pieces of music. Songs. And this eventually evolves into -- probably -- even voices. So the ability to create voices."

"This could potentially be something really viral for platforms like TikTok, or YouTube Shorts or Instagram... And this could eventually evolve into things like karaoke, for example. And be, I don't know, part of game consoles, or things like that, for people to use this to entertain. And, if we go a step further -- and it's the technology getting better and better as we think it will be -- this could potentially be a professional tool for people who want to create music. Or for people who want to create voices for movies or voices for games characters.

"We have a strong belief in user-generated content, and we are building tools for our users to start creating sounds and creating voices. And we will be putting technology in the hands of the users to create those [sounds]. And, eventually in the future, hopefully, they will go even to a professional level."

So while synthesizing a whole voice currently still requires the startup's team of sound engineers and designers, Bosch suggests generative AI will put that power in the hands of the individual -- and it'll happen soon; "in the near future".

"I don't know if we'll be prompting -- now we're in this wave of everything is done through prompts -- I'm not sure if that will be the way or it will be more tools that will have AI technology embedded and we have user experiences that will make things a lot easier," he adds. "But definitely what I see from generative AI in the audience but also in the management phase is that suddenly everyone can become a creator, which I think is really interesting."

The birth of AI voice may not sound like amazing news for the employment prospects of sound engineers and designers (albeit, tech advances may simply create new requirements that just shift where their expertise is needed). But Bosch reckons that voice actors, at least, will still have a key role to play -- emoting for AI. Since robot voices aren't good at getting the pitch and intonation, or indeed emotion, right. It's a voice clone without a soul, basically. (Or as Nick Cave might put it, AI voice lacks 'its own blood, its own struggle, its own suffering' -- it lacks humanness.)

"I think that you will always need a human factor in your sample with these voices," suggests Bosch. "You could have the best voice -- of even a famous person -- but what really comes is the impression. You still need a human to do the cadence on the words. You still need a human to do the rhythm, the tone. So [it's not just that] I can speak normally and I will sound like a famous person -- no, you don't -- you still need to act a little bit. So... I think human factor for expression is key."

Might generative AI not be able to learn to emote as well, with the right human data-sets -- and further dial up its mimicry so as to make us laugh or cry or love or hate on-demand too?

"Yeah. Well, we will see," responds Bosch. "I'm not sure. I mean, as of today, for me AI is a tool to be used by humans. But yeah, we don't know where this is going to evolve."

Voicemod for Desktop. Image Credits: Voicemod

Voicemod is gearing up for whatever phonic craziness lies ahead with a fresh tranche of funding. The 2014-founded startup has been revenue generating for years, via pro versions of its tools -- its main product, Voicemod for Desktop, has had more than 40 million downloads to-date, while Bosch says it has 3.3 million monthly active users -- but it's just closed $14.5 million in expansion funding, following an $8M Series A back in summer 2020. Madrid-based Kfund’s growth fund, Leadwind, led the round, with participation from Minifund (Eros Resmini, former CMO at Discord) and Bitkraft Ventures.

"We’re super excited by what generative AI can do to all creative industries and more specifically audio, especially when it comes to enhancing and augmenting the job that creative people already do," Jaime Novoa, partner at Kfund, tells TechCrunch. "In the past few months there’s been an explosion in generative AI in general and more specifically in audio but we think this is a phenomenon that’s just starting.

"What many of the cool technologies being launched to market lack are concrete and scalable business models attached to them, and Voicemod differentiates itself from the pack by having built a product used by millions of people on a daily basis and with significant revenue traction. We’re super excited about what Jaime and the rest of the Voicemod team have in the pipeline and what’s to come."

Voicemod says the extra funds will be used to enhance the development of its real-time AI voice identity capabilities -- and dial up its proposition for Gen Z, gamers, content creators, and professionals of all skill levels wanting tools to help them express themselves vocally in digital spaces.

Per Bosch, part of the reason it's taking more funding now relates to the acquisition of Voctro Labs. Beyond that, he says it's about making the most of the opportunities sparking off the Cambrian explosion in generative AI tools.

"We are in the middle of a tremendous revolution in AI," he says. "We want to be well funded in order to be able to develop technology but also to be able to deliver technology to users. So I think one of our competitive advantages is that we already have the market and the traction and we basically are able to put this in the hands of the users. And I want to make sure to have enough runway, also due to market conditions, to be able to put all of this in place. So it will be mainly focused... on building the next generation AI technology and putting it in the hands of the users and also building these creation tools for the users to create content."

The first new tool will be landing next month -- with a launch of Voicemod's desktop product on macOS (currently it's PC only). The goal is to evolve into a multi-platform product spanning all devices. "We're also working on a creation tool mobile app that hopefully will see the light towards the beginning of next quarter. And, and yeah, some more stuff to come, hopefully," Bosch adds.

He also tells us the startup is working on a watermarking technology which it hopes to launch in Q2 this year -- to give platforms a way to be able to spot AI-generated voices in the wild.

Such a feature is likely to be a vital tool to counter all the possible negative use-cases (scams, fraud, manipulation, abuse, bullying, trolling etc etc) one could imagine humans coming up with for voice-shifting tools that let you sound exactly like someone you're not.

"It's an algorithm to watermark the audio," explains Bosch. "Moderation is complicated because it really changes depending on the space... on which are the platforms where the audio is used -- so we believe that the channel is the one that should own that moderation and what we are doing is we will be providing this watermarking system in order for them to be able to know if the audio is created via synthetic voice or is created by a real voice."

"Every single new technology can be used for the good or for the bad," he adds. "So we are of course putting some technology, some tools, in place to be able to have more control around a misuse of this technology."

On questions of licensing for training data, IP issues here are currently a grey area -- as the law hasn't caught up with developments in AI (let alone generative AI). That means startups operating in the space have to consider whether to make the most of total legal freedom to do whatever they want (and hope expensive consequences don't come clanging down on them in short order), or tread more carefully and thoughtfully. (Other startups in the space include the likes of Voice AI, Koe and ElevenLabs.)

Bosch claims Voicemod is taking the latter approach -- using (paid) voice actors to build up data-sets to train and hone its AI models. If it wants to make use of some original content he says the team will go to the IP provider and negotiate -- and figure out what kind of licensing terms they'd be up for. (The generative AI boom is also a crazy-thrilling time to be an IP lawyer, clearly.)

"We are basically pioneering here," he adds. "So a lot of things are without laws yet so we were trying to stick to our values, basically, and try to do the right thing. That's our approach on the data [side]. But yeah, you're completely right -- there's no 'legal attachment' to your voice, as of today... We own our fingerprint. You don't own, like, whatever the fingerprint of your voice [is]. As of today.

"It sounds a little bit like science fiction but maybe, in the future, we will 'own' something related to our voice."

For the record, Bosch was talking to me with his actual voice. The company's real-time voice-shifting technology doesn't yet work over mobile. But he says that's coming too. So buckle up: The synthesized future is gonna be a screaming wild ride.
