Synthetic speech startup Murf lends a voice to content creators of all sizes 

Synthetic speech tech startup Murf gives a voice, literally, to content creators of all sizes. Murf, which now has a library of more than 120 human-parity AI voices across 20 languages, announced today it has raised $10 million in Series A funding led by Matrix Partners. Participation came from returning investor Elevation Capital and several prominent angel investors, including Ola founder Ankit Bhati; Disney Streaming SVP of product; Ashwini Asokan, the founder of Mad Street Den; and Pushkar Mukewar, founder of Drip Capital.

Murf was founded in October 2020 by IIT-Kharagpur school friends Sneha Roy, Ankur Edkie and Divyanshu Pandey. Its previous funding announcement was a $1.5 million seed round led by Elevation Capital, with participation from angel investors, which helped the company recruit talent and invest in product innovation and user acquisition. Murf says that since its seed round, it has grown 26x in ARR and synthesized more than one million voiceover projects in a variety of speaking styles and tones.

Examples of how Murf’s technology has been used include a tech entrepreneur and artist who created an entire film using AI art models, deepfake programs and AI voices from Murf Studio; an entertainment animation agency that created a TV series using a collection of Murf’s voices; authors creating fantasy fiction audiobooks with Murf’s AI voices; and a YouTube influencer who used Murf’s AI voice to create a rap video.

Murf’s founders. Image Credits: Murf

Edkie, the CEO of Murf, told TechCrunch that even though Murf’s founding team worked in different domains in the past, they all ran into the pain points of creating high-quality voiceovers. This included creating and updating product demos and recording radio and video ads. He added that the pandemic “provided a boost to multimedia creation and the demand for scalable audio content was growing rapidly.”

Murf’s clients have used it in a variety of ways, including advertising, audiobooks, explainer videos and e-learning. Murf.ai, its SaaS platform, was developed to make it easier for clients to create high-quality natural-sounding voiceovers for any commercial purpose. The company’s clients range in size from individual content creators to SMBs and enterprises, and work in sectors like education, corporate, healthcare, media and entertainment, marketing, advertising, podcasting, customer support and more. 

Edkie told TechCrunch that content creators and marketing teams often record voiceovers themselves, or outsource the entire process, both of which are “cumbersome, expensive and time-consuming.” Murf, on the other hand, lets users generate “human-like” voiceovers without needing to buy recording equipment or hire a voice artist. 

The company also wants to remove limitations on what text-to-speech can do. “While TTS has been around for quite some time now, limitations in voice quality have restricted its usage. By leveraging recent advances in AI and deep learning, we are making it possible to create high-fidelity synthetic voices that mimic the natural prosody and pronunciation of human speech.”

Murf’s platform includes an AI-enabled SaaS tool that helps users generate “human-like” voices, typically for use in videos or presentations, without having to procure complex and costly recording equipment or hire a voice artist. Content creators can use an online voice recording booth, where they can sample a wide array of speaking styles. Murf wants to bridge the diversity gap in traditional text-to-speech platforms by including voices across accents, like African American, British, Australian and others.

According to market reports used by Murf’s founders, the global text-to-speech market is expected to reach $7.06 billion by 2028, growing at a 14.6% CAGR. Meanwhile, the voiceover and dubbing market is predicted to generate a total of $8 billion annually by 2027.
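For readers who want to sanity-check a projection like this, compound annual growth can be worked backward as well as forward. The Python sketch below is illustrative only: the function names are hypothetical, and because the reports' base year is not stated here, the five-year horizon in the usage note is an assumption rather than a figure from Murf or its sources.

```python
def project_value(start_value: float, cagr: float, years: int) -> float:
    """Grow start_value at a constant compound annual rate for `years` years."""
    return start_value * (1 + cagr) ** years


def implied_start_value(end_value: float, cagr: float, years: int) -> float:
    """Work backward from a projected end value to the value `years` earlier."""
    return end_value / (1 + cagr) ** years


if __name__ == "__main__":
    # Assumption: a five-year horizon ending in 2028; the source does not give one.
    start = implied_start_value(7.06, 0.146, 5)
    print(f"Implied market size five years earlier: ${start:.2f} billion")
```

Under that assumed five-year horizon, a 14.6% annual rate roughly doubles the market, so the implied starting point is about half of the $7.06 billion projection.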

Text-to-speech has been around for years, but quality limitations meant it was used primarily by voice assistants and chatbots. Recent developments in AI and deep learning now mean it is possible to create synthetic voices that have the prosody and pronunciation of human speech. Murf’s AI engine is trained on hours of actual human speech, and Murf Studio offers more than 120 human-parity AI voices that can speak in 20 languages. Murf is also working toward more diverse accents by partnering with voice actors to bring aboard voices like African American, British and Australian English.

Murf’s AI-powered text-to-speech can also learn from contextual information to return the right responses. The founders describe Murf as an “all-in-one-voice solution” that enables users to add images, videos and background music. It also has features for specifying pronunciation using the International Phonetic Alphabet (IPA), as well as voice customizations that let users change pitch, pauses, emphasis and speed.

Murf makes money through a subscription plan for its services. It came out of beta testing in January 2021 and, over the last 18 months, has grown 22x in ARR and synthesized over one million voiceover projects.

Edkie said that Murf’s main competitors are the large tech and cloud companies, like Google, Amazon (with Polly) and Microsoft, which have the leading text-to-speech platforms in the market. Murf sets itself apart with natural-sounding AI voices that also support multiple accents and styles.

“Going beyond a simple text to speech tool, our platform offers the ability for users to add images, videos and presentations to the voiceover, include background music and sync them all together to create compelling content,” said Edkie. Murf’s AI-powered TTS can also learn from large amounts of contextual information to create contextual speech. For example, it has built-in context awareness that can recognize commonly used entity formats like numbers, currencies, percentages, addresses, dates and times, reducing their randomness and bringing them closer to a predefined standard, Edkie added.
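The entity handling Edkie describes is generally known as text normalization: rewriting written tokens into the words a voice should actually speak. The Python sketch below illustrates the general idea with simple rules for dollar amounts, percentages and plain numbers. It is not Murf's implementation, which has not been published; every function name is hypothetical, and the rules assume numbers without thousands separators and read years as ordinary integers.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out a non-negative integer (intended for values below one million)."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + (" " + number_to_words(rest) if rest else "")
    thousands, rest = divmod(n, 1000)
    return number_to_words(thousands) + " thousand" + (" " + number_to_words(rest) if rest else "")


def spell_number(token: str) -> str:
    """Spell a token like '14.6' as 'fourteen point six'."""
    whole, _, frac = token.partition(".")
    words = number_to_words(int(whole))
    if frac:
        # Digits after the decimal point are read one at a time.
        words += " point " + " ".join(ONES[int(d)] for d in frac)
    return words


def _currency(match: re.Match) -> str:
    amount, scale = match.group(1), match.group(2)
    unit = "dollar" if amount == "1" and not scale else "dollars"
    return spell_number(amount) + (f" {scale}" if scale else "") + f" {unit}"


def normalize(text: str) -> str:
    """Rewrite currencies, percentages and numbers into speakable words."""
    # Currency first, so '$10 million' becomes 'ten million dollars'.
    text = re.sub(r"\$(\d+(?:\.\d+)?)(?:\s+(million|billion))?", _currency, text)
    # Percentages next, so the '%' sign is consumed with its number.
    text = re.sub(r"(\d+(?:\.\d+)?)%", lambda m: spell_number(m.group(1)) + " percent", text)
    # Any remaining standalone numbers.
    return re.sub(r"\b\d+(?:\.\d+)?\b", lambda m: spell_number(m.group(0)), text)


if __name__ == "__main__":
    print(normalize("Murf raised $10 million at 14.6% growth with 120 voices."))
```

Applying the rules in a fixed order (currency, then percentages, then bare numbers) keeps each symbol attached to its number. A production system would need many more formats, including the addresses, dates and times Edkie mentions, and would likely rely on learned context rather than regular expressions alone.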

In a prepared statement, Elevation Capital co-managing partner Mukul Arora said, “AI-driven, life-like voiceovers are the next frontier in the text to speech market. Murf, with their stellar founding team and unique IP, is perfectly poised to gain a leadership position in this space. Their execution prowess and tech-first focus is evident in the solid traction and growth that they’ve demonstrated so far. We are really excited to double down on our partnership with Murf.”