Tavus taps generative AI to power personalized videos with voice and face cloning

Generative AI is already looking like the major tech trend of 2023. The ability to generate fresh content via algorithms has been thrust into the public consciousness by the likes of ChatGPT, a chatbot built on large language models (LLMs) and capable of producing essays, poems, lyrics, news articles and even computer programs. Then there’s DALL-E, from the same Microsoft-backed OpenAI that spawned ChatGPT, which serves a similar purpose but for visual creations instead.

While some have argued that ChatGPT signals AI’s arrival into the mainstream, the truth of the matter is that we’re just at the start of a new era of AI-powered applications that will transform just about every facet of industry, from consumer search and stock photography to real estate and content marketing.

It’s against that backdrop that a fledgling startup called Tavus is looking to make its mark by enabling companies to create “unique” videos tailored to a specific individual, but based entirely on a single initial recording.

The idea is that a sales and marketing team, for example, can issue an endless stream of video pitches to prospective customers, maybe based on textual data the prospect submitted through an online form. Or perhaps a headhunter will use the platform to send multiple personalized videos to potential candidates using data gleaned from their LinkedIn profiles.

Founded out of San Francisco in 2020 by CEO Hassaan Raza and Quinn Favret, Y Combinator (YC) alum Tavus today announced that it has raised $6.1 million in a seed round led by Silicon Valley investor Sequoia, with participation from a slew of high-profile backers, including Accel Partners, Index Ventures, Lightspeed Ventures and YC Continuity.

How it works

Any company looking to create multiple personalized videos will know that it’s an incredibly time-consuming, repetitive process: recording the same message with substantively the same content, but tweaked for different clients or candidates. That is what Tavus is looking to address: allowing users to create their own AI video templates in minutes and then generate an unlimited number of versions of a video from that original source.

The initial onboarding process requires the user — for example, a recruiter or sales executive — to record a 15-minute video based on a script provided by Tavus, which is used to train the AI. Then, the user records a template for each campaign they want to create.

Tavus: Reading a script to create a base template. Image Credits: Tavus

Using a web-based editor, users can then select which elements of the video they want to personalize, specifying each variable (e.g., company, executive name or location), adding in calls to action and so on.

Tavus: Personalizing script with variables. Image Credits: Tavus
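
To make the variable step concrete, here is a minimal Python sketch of how a script template with per-recipient variables could be filled in. It is only an illustration of the idea, not Tavus's actual editor or API: the placeholder names, the sample script and the prospect data are all hypothetical.

```python
from string import Template

# Hypothetical script template; the placeholder names are illustrative only
# and are not taken from Tavus's product.
SCRIPT = Template(
    "Hi $executive_name, I saw that $company is growing its team in $location. "
    "I'd love to show you how we can help."
)


def render_script(variables: dict[str, str]) -> str:
    """Fill every placeholder; raises KeyError if a variable is missing."""
    return SCRIPT.substitute(variables)


# Made-up recipients, one dict of variables per personalized video.
prospects = [
    {"executive_name": "Dana", "company": "Acme", "location": "Austin"},
    {"executive_name": "Lee", "company": "Globex", "location": "Leeds"},
]

scripts = [render_script(p) for p in prospects]
for script in scripts:
    print(script)
```

Each rendered script would then stand in for the text that the cloned voice and face deliver, so one recorded template yields as many versions as there are rows of data.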

Tavus also supports longer-form variables via ChatGPT-powered snippets for more personalized introductions, something that Favret says has been highly requested by its users. For example, the base script with which a video is created can be configured to include a one-sentence introduction generated from a specific individual’s LinkedIn profile.

Tavus: Generating personalized content. Image Credits: Tavus

In essence, Tavus is striving to replicate what mass-marketing software has been doing in the email realm for donkey’s years, bringing it to the more visually engaging world of videos. This could raise red flags for some: Will people be receptive to a personalized sales pitch when they discover that the sender hasn’t really taken the effort to make a video just for them?

But more than that, there is perhaps something a little creepy about an AI-generated video that uses personal information gleaned from a database — something designed to be personal could ultimately come across as incredibly impersonal when the user finds out how it was made.

Questions raised by such scenarios will continue to arise as AI becomes more ingrained in our everyday lives. Favret is quick to stress that while sales and marketing are obvious use cases for its technology, it isn’t purely about those verticals — it’s seeing uptake from “an eclectic group of users,” including recruiters, university deans and C-level executives.

“There’s a common misconception that Tavus only works with sales and marketing teams,” Favret said. “While this is a focus of ours, we have users applying Tavus in innovative and powerful ways across the full customer journey. Many of our power users apply Tavus broadly across their organizations, including for customer success, product, recruiting and other go-to-market related functions.”

And who exactly is putting themselves forward for cloning?

“Typically, the user clones themselves, but it’s also common for businesses to have a central figure, such as an executive or spokesperson, record the videos to have a consistent face of the company,” Favret said. “Tavus is designed for all types of users to easily be able to clone themselves in minutes.”

Under the hood, Tavus says it uses machine learning to train a model on facial gestures and lip movements, creating a system that realistically mimics these movements in sync with synthesized audio.

As for deployment, companies can access Tavus in two main ways: Half of its users work through its web dashboard, while the other half integrate it into their own systems via APIs or natively.

“We frequently see sales teams use and deploy Tavus directly through the platform, as they’re able to efficiently generate large batches of videos for campaigns,” Favret said. “Other teams will use Tavus in a more programmatic manner, integrating it directly within their systems. This allows users to create ‘event-driven’ workflows, where a Tavus video can be generated and sent after a trigger.”

By way of an “event-driven” example, if a potential customer were to submit a form on a company’s website, the company could automatically generate and send a custom Tavus video to that lead using data that the customer had submitted.

“This empowers companies to capitalize on opportune timing for maximum conversions without having to wait for a team member to record the video,” Favret said.
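
The form-submission trigger described above can be sketched in a few lines of Python. This is an assumption-laden illustration rather than Tavus's integration: `VideoJob`, `VideoQueue`, `on_form_submitted`, the field names and the `welcome-pitch` template ID are all hypothetical, and the queue is a stand-in for whatever service would actually render and send the video.

```python
from dataclasses import dataclass, field


@dataclass
class VideoJob:
    """One personalized video to generate and send (hypothetical shape)."""
    recipient_email: str
    template_id: str
    variables: dict[str, str]


@dataclass
class VideoQueue:
    """Stand-in for the service that would render and deliver the video."""
    jobs: list[VideoJob] = field(default_factory=list)

    def enqueue(self, job: VideoJob) -> None:
        self.jobs.append(job)


# Hypothetical form fields the handler expects.
REQUIRED_FIELDS = ("email", "first_name", "company")


def on_form_submitted(
    form: dict[str, str],
    queue: VideoQueue,
    template_id: str = "welcome-pitch",
) -> VideoJob:
    """Trigger handler: turn a submitted lead form into a queued video job."""
    missing = [name for name in REQUIRED_FIELDS if not form.get(name)]
    if missing:
        raise ValueError(f"form is missing fields: {missing}")
    job = VideoJob(
        recipient_email=form["email"],
        template_id=template_id,
        variables={"first_name": form["first_name"], "company": form["company"]},
    )
    queue.enqueue(job)
    return job


queue = VideoQueue()
job = on_form_submitted(
    {"email": "dana@example.com", "first_name": "Dana", "company": "Acme"},
    queue,
)
print(job)
```

The point of the pattern is that no person sits between the trigger and the video: the data the lead typed into the form is the only input the job needs.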

Misuse

At the top end of the generative AI spectrum, we’re seeing the likes of Microsoft and Google slug it out to see who can get their respective smarts into the hands of businesses and consumers the quickest, a battle that Microsoft seems to be winning at present. At the same time, a whole host of generative AI startups are coming to the fore. GlossAi, for example, is using AI to help businesses easily create shareable marketing skits, while Typeface is doing something similar for marketing copy and image generation in the enterprise.

Specific to Tavus, there have been comparable companies out there for a few years already, such as Windsor, which does something similar albeit with a heavy focus on e-commerce. We also have London-based Synthesia, backed by a swathe of high-profile investors, which is more about creating digital avatars from text for use in training and how-to videos.

So it’s clear that even before all the generative AI hype of the past few months, there was a growing movement in the startup world in that direction, which Tavus is now looking to capitalize on. Indeed, in its short life so far, Tavus has garnered some fairly large customers, such as real estate data company CoStar and French tech scaleup AB Tasty.

However, it’s worth considering potential misuse of this kind of technology. For instance, is there anything stopping anyone from uploading a video of someone else talking, and then creating new videos from that template? Certainly, there’s no shortage of examples of deepfake chicanery from across the video and voice spectrum. Some companies, such as Deep Voodoo, the startup from the creators of South Park, are already raising VC cash for their deepfake endeavors. As this type of technology becomes more ingrained and normalized within society, there will be more questions about its ethical implications, even if the underlying intentions are well-meaning.

According to Favret, Tavus has a built-in safeguard that makes it more difficult to fool the system: Users must complete a voice verification and record live on the platform.

“This means that users cannot upload videos of others talking,” Favret said. “Furthermore, users have all rights to their data and likeness, allowing them to delete or remove their videos at any time. We are also very intentional about how Tavus is used: we screen each use-case before a user starts with Tavus, ensure that the use case meets our community guidelines, and that it is ethical. Security and ethics are incredibly important to us, especially given the youth of this technology in society.”

Show me the money

In terms of business model, Tavus offers a basic intro plan aimed at smaller businesses that costs $275 per month and comes with restrictions, such as a cap on the number of videos a customer can create. The custom “business” plan removes these restrictions, though there is no advertised price; Tavus tailors the price according to how a company intends to use the product.

“Tavus plans are customized to a company’s specific use-case and needs, but on a high level, we operate on a usage-based model, where users are charged based on a combination of the number of seats they have as well as the number of videos they generate,” Favret said.
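
As a rough illustration of how a seats-plus-videos charge combines, here is a minimal Python sketch. Tavus does not publish its custom-plan rates, so the formula's linear shape and every number in the example are made-up assumptions, not the company's pricing.

```python
from decimal import Decimal


def monthly_charge(
    seats: int,
    videos: int,
    per_seat: Decimal,
    per_video: Decimal,
) -> Decimal:
    """Usage-based charge: a per-seat component plus a per-video component.

    The linear form and the rates passed in are illustrative assumptions.
    """
    if seats < 0 or videos < 0:
        raise ValueError("seats and videos must be non-negative")
    return seats * per_seat + videos * per_video


# Hypothetical rates chosen only to show the arithmetic.
example = monthly_charge(
    seats=5,
    videos=200,
    per_seat=Decimal("50"),
    per_video=Decimal("0.50"),
)
print(example)
```

With these invented rates, five seats contribute 250 and 200 videos contribute 100, so the two usage dimensions add up independently.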

Prior to this seed round, Tavus had raised a small amount as part of its participation in the YC program back in 2021. Its full roster of seed-round investors includes Sequoia, Accel Partners, Index Ventures, Lightspeed Ventures, YC Continuity, SV Angel, Hack VC, Remus Capital, Mantis Capital, Liquid2 Ventures, Zillionize, Soma Capital, GTMfund, Terra Nova and several undisclosed angel investors.