Descript raises $30M to build the next generation of video and audio editing tools

The popularity of podcasting and online video shows no signs of slowing down, and so we continue to see a wave of creators publishing a profusion of audio and video content to fill out the airwaves. Today, a company building a platform to make that work easier and more interesting to execute is announcing a round of growth funding to double down on the opportunity.

Descript, which builds tools that let creators edit audio and video files by using, for example, natural language processing to link the content to the editing of text files, has picked up $30 million in a Series B round of funding.

Andrew Mason, the CEO and founder of the company, said in an interview that the plan will be to use the money to continue building out tools not just for mass-market and individual professional and amateur creators, but also, increasingly, organizations that might be using the tools for their own in-house video and audio needs, a use case that has definitely grown during the last year of global remote working.

“We see ourselves… as an all encompassing platform for all media needs,” Mason said.

The company had early wins by signing on customers like NPR, Pushkin Industries, VICE, The Washington Post and The New York Times, as well as smaller and more modest media outfits.

Mason said that startups and bigger businesses that use video for communication are now also adopting Descript tools, especially in cases where it makes more sense to show an answer visually but the content still benefits from editing.

“Whether it’s externally or internally, for things like bug reporting or personalized introductions or helpdesk videos, we’re seeing people using Descript for company video,” he added, “sometimes in place of something like an email.”

Spark Capital, and specifically Nabeel Hyatt (who in a past life co-founded a music games specialist, Conduit Labs, acquired by Zynga), led the round, with Andreessen Horowitz and Redpoint Ventures also participating (both backed Descript in its $15 million Series A in 2019).

A number of individuals (some investors, and some investors also famous for their own video, podcasting and publishing work) also participated in this Series B, among them Devdatta Akhawe, Alex Blumberg, Jack Conte, Justine Ezarik, Todd Goldberg, Jean-Denis Greze, John Lilly, Tobi Lutke, Bharat Mediratta, Shishir Mehrotra, Casey Neistat, Brian Pokorny, Raghavendra Prabhu, Lenny Rachitsky, Naval Ravikant, Jay Simons, Jake Shapiro, Rahul Vohra, and Ev Williams.

The news comes on the heels of an eventful several months for the company. In October, Descript released its first major update to its editing suite by expanding from audio editing tools to cover video as well.

In an interview last week, Mason said that the feedback so far has been “excellent” for the technology, although he declined to say how many users Descript has, or how much usage it has seen, for this or its older audio technology.

Descript’s move expanding into the newer medium, in any case, makes a lot of sense, when you consider how closely aligned a lot of audio-based podcasting content has been with corresponding videos — with many of the most popular podcasters often posting videos of their recordings on YouTube and other platforms, for those who prefer to watch as well as listen to recordings.

It helps, too, that video is highly monetizable. Podcasting is on track to make more than $1 billion in ad revenues in the U.S. in 2021, according to the Interactive Advertising Bureau. Meanwhile, even in a year that was considered a downturn, digital video pulled in more than $22 billion.

That double-platform approach, however, has largely been executed on autopilot up to now, as Mason points out, describing a lot of the video as “window dressing.”

“We watch a lot of video and podcasts and think about how we can create a tool that makes it fun and easy to craft great content,” Mason said. “One thing we’ve observed is that a remarkable amount of video is just audio with window dressing. You don’t notice it until you start looking through that lens. A ton of video is about what is happening with the audio, and so a lot of that video is just filler.”

A lot of the editing is no more than a series of jump cuts, he said, and notwithstanding other challenges like bad equipment, it’s just not a very exciting experience.

That lays the groundwork for Descript not just to create tools to make it easier to edit but in the future to conceive of how to do so in a way that creates a better and potentially more original product at the end of the process, too.

Mason’s turn to audio-based services for his two past startups — prior to Descript, he founded and eventually sold (to Bose) an audio-based city guide service called Detour — has been something of a left turn for a man probably still better known as the quirky co-founder of the once wildly popular sales platform Groupon.

However, Mason studied music at university, and if you talk to him, it is more than obvious that audio and sound-based experiences — not just music but the impact that aural experiences can have — are really where his passion lies.

Mason is long gone from Groupon, but he remains a bit of a wag. He is quick to quip that his ability to raise money for completely different concepts that are a world away from e-commerce is in no small part due to his having already won the “startup lottery”.

And yes, like many jokes, it’s a telling and often true term, in my experience and observation. But in this case, I’d say it undersells some of the really interesting innovations that Descript has built and is building using innovations in AI to think about how to address some of the challenges that have emerged out of media production — at once so easy to do (so many creator platforms today) yet hard to get right.

More generally, audio technology is not only proving to be in demand with customers, but (as it happens) it is also being sought out by larger tech companies, including (most recently) Amazon, Spotify, Apple, Google and Facebook, which are picking up a lot of smaller audio startups in their own efforts to build out their bigger media businesses.

And this is at the heart of why Descript has attracted this latest round of investment.

“We’ve been convinced of machine learning’s power to be used as a creative tool for some time,” Hyatt at Spark noted to me. “Descript is perhaps the best example of that in a startup today. The company takes some very complicated technology, but presents it in a way that’s actually easier to use than the status quo products. It’s very rare that you come across a company that uses technology to both empower a creative professional to work ten times faster, and simultaneously makes the creative process ten times easier for an amateur, growing the addressable market. Anyone editing audio or video, which is most of us nowadays, can see the benefits.”