
Voice.ai raises $6M as its real-time voice changer approaches 500K users


graphic depiction of white soundwaves on pinkish background
Image Credits: Bryce Durbin/TechCrunch

Services like Midjourney and ChatGPT have pushed the boundaries of how AI can create images and text out of basic text prompts. Now, audio appears to be the inevitable next frontier. Music generation based on word prompts, AI tutors for language learning and voice simulators have all seen developments in recent months. Voice.ai hopes to be a part of that conversation (heh) with technology that lets users change (and disguise) their voices in real time, and now it has raised its first outside funding on the heels of early growth.

With more than 480,000 users and a library of more than 50,000 voice filters, Voice.ai has picked up $6 million, funding that it plans to use to take its voice-changing tech into new places.

Mucker Capital and M13 are leading the round. Before now, Voice.ai has grown by word of mouth — the startup has a Discord channel with more than 120,000 people — on the back of $3 million in self-funding.

Currently the company’s tools — available as apps for Mac, PC, Android and iOS — are getting adopted by gamers, content creators, Vtubers and others on TikTok, Zoom, Discord, Minecraft, GTA5, Fortnite, Valorant, League of Legends, Among Us, Skype, WhatsApp and other platforms. The Voice.ai interface lets them create a new voice, or select from some 50,000 different pre-created voices (created and shared by users like themselves), which can be used as-is or modified, to use live in supported platforms, or for recordings.

The plan is to use the funding to hire more technical talent and to build new SDKs and APIs to work with further platforms like Meta, Unreal and Unity; bring on multi-language support; and add in new applications like singing where voice is center stage.

The startup doesn’t single it out, but it will be interesting to see if it also uses some of the funding to increase server capacity.

That is no small burden. Anecdotally, we’ve heard that GPU pain is one of the biggest gating factors in how a lot of AI apps are able to scale at the moment. (It’s partly why you’re seeing big deals being made that include strategics providing processing and server capacity.)

For Voice.ai specifically, your voice is processed locally and channeled into wherever it will be used through what founder and CEO Heath Ahrens described to me as a “virtual audio cable.” But when you look at reviews of its apps, a common lament is that when you sign up you are put on a waitlist because “overwhelming demand has our servers at max capacity” with a promise that you’ll be informed when the service increases that capacity.

There are dozens of speech-to-voice and voice-to-speech services in the market today, and already a lot of activity among them: Last year Spotify acquired Sonantic and Snap bought an AI voice assistant even earlier than that; another startup, Sanas, is working on changing your accent and there are the voice simulators Murf and Acapela, among many others. Voice.ai counts itself in the same general category as Respeecher and ElevenLabs, two voice-to-voice AI startups, letting users apply masks to tweak or completely transform their voices — in some cases creating completely synthetic voices in place of the real thing.

Respeecher, founded and based in Ukraine, made a name for itself by helping build a new Darth Vader voice for new Star Wars installments, based on how James Earl Jones sounded 45 years ago when he originated the role. (In keeping with a character hell-bent on destroying worlds, Darth’s voice was delivered to the Hollywood client from its offices in Ukraine as Russia marched into the country.)

ElevenLabs — famously (or infamously as the case may be) — has built a platform that is frighteningly good at cloning voices, and earlier this month it picked up its most recent funding round of $19 million from a group of big-name investors.

Voice.ai is trying, in that mix, to position itself as the AI voice-modifying app for the everyman.

“There are plenty of companies that are trying to provide a different flavor of voice tech to businesses,” Ahrens told TechCrunch in an email (ironically, it wasn’t possible to arrange a live interview with him). Ahrens has some experience with the building of B2B AI tech: his two previous companies — iSpeech for text-to-speech and Haystack for face recognition — are built around API offerings.

“What sets Voice.ai apart is that we are focused on bringing tech that was previously reserved for enterprise companies directly into the hands of consumers in an affordable fashion.” Many users, he noted, “come to us from classical DSP voice changers and voice modulators which they had been using in the past and which are still popular among many gamers and streamers.”

“Affordable” comes in two tiers, with most users now on a free service that requires them to opt in to providing computational power to train Voice.ai’s models. The service is built on the company’s own private data set comprising “millions of unique users.” No pricing is provided on the site; we’re asking for those details.

“We believe in making technology accessible and plan on working together with the open source community to democratize Voice AI technology,” added Ahrens.

Voice.ai also claims to take a fundamentally different approach to the challenge of changing a voice, tapping into some of the ethos that has built up around the use of avatars by Vtubers, gamers and others online.

“Most voice AI companies that are coming into the space try to build scalable enterprise focused text-to-speech solutions or expensive voice-to-voice services for production studios,” Ahrens said. “We start from the opposite spectrum and try to deliver value to individuals who are looking to expand how they sound online. The core value proposition of our speech-to-speech AI isn’t that it can perfectly replicate any given person. It’s that it retains the core elements of a user’s speech: their emotion, pacing and emphasis while replacing the sound of the voice, in order to create a completely unique new end result, in real-time.”

It might be because of how the demographics in interactive platforms like gaming skew, but for now Voice.ai’s audience is 70% male versus 30% female, with new categories opening not just around who is using the tech, but why.

That includes not just those using avatars and building voices to match them, or those looking for more privacy protection, but also, he said, “transgender users who can represent themselves with voices that match their identity, as well as users exploring completely new online personas for themselves.”

There is already a base of users tapping into Voice.ai’s direct-to-consumer offerings, but one reason Mucker is investing in the startup is that it believes there is an opportunity to build out a network of developers using and integrating its tech.

“Voice.ai is poised to revolutionize the AI developer community in a manner akin to AdMob’s impact on the mobile app developer community,” said Omar Hamoui, a partner at lead investor Mucker Capital. (Hamoui previously founded the mobile ad startup AdMob, eventually acquired by Google, so he has some direct experience building mobile developer tools.) “By offering user-friendly solutions that were once exclusive to large enterprises, Voice.ai aims to democratize access for developers worldwide.”

Karl Alomar, the former COO of DigitalOcean, who led the investment for M13, said investors will be taking an active role in the next stage of development. “At DigitalOcean too we saw the value of building a community of builders by builders,” he said. “We’re excited for creators and developers to build on the Voice.ai platform.”
