Benchmark Backs Real-Time Data-Processing Startup Confluent

Companies are creating huge amounts of data, but harnessing it and making it useful remains a problem for many businesses. A startup called Confluent, founded by members of LinkedIn’s data infrastructure team, hopes to solve that problem by building commercial tools around open-source software its founders developed.

To go after the market, Confluent has raised $6.9 million in funding led by Benchmark, with new partner Eric Vishria joining the company’s board. Along with Benchmark, LinkedIn and Data Collective also invested in the round.

Confluent was founded by LinkedIn alums Jay Kreps, Neha Narkhede and Jun Rao, who built and maintained the open-source project Apache Kafka, which LinkedIn used internally to collect and manage data from different sources within its network. The software was built to unify data from multiple silos in a low-latency, highly scalable way, giving organizations access to it in real time.

At LinkedIn, Apache Kafka was used to populate the company’s Hadoop cluster with the data that powered its activity stream, and to provide operational metrics. Since its introduction, however, a number of other companies have adopted the tool to manage and analyze data across their own organizations.

The list of companies relying on Apache Kafka for various projects reads like a “Who’s Who” of the tech ecosystem: Twitter, Netflix, Pinterest, Uber, Spotify, Tumblr and Mozilla. The organizations are using the software for everything from real-time analytics to fraud prevention.

Other companies clearly found Apache Kafka useful, but not every organization has the engineering resources or tech savvy to work with open-source tools. Kreps said that over time the team behind Kafka ended up fielding calls from organizations that wanted to implement it but weren’t used to working with open-source software and needed help getting up and running.

After doing free training for some of those businesses, the team decided to follow the lead of other open-source organizations by productizing and commercializing a series of tools that would make it easier for a wider range of companies to use Kafka.

“If you want to make something real, you can’t do that entirely through open source,” Kreps told me. “You need to make something that companies can use in an out-of-the-box way.”

Apache Kafka will remain open source, but Confluent is hoping to build products that will help organizations quickly get up and running with it.

“There’s a set of companies that prefer to do everything in-house, and then there are companies that are willing to pay for software,” Kreps said.

Confluent hopes to cater to the latter set while continuing to improve the capabilities of what its founders built in Kafka. It’s a huge problem and a huge market, and one that the founders are intimately acquainted with. All of that is why Vishria wanted Confluent to be his first investment, despite being brand new to venture capital.

“When you first become a VC everyone tells you, ‘Go slow, and don’t make any investments right away,’” Vishria said. But he was impressed by the founding team and the product, as well as the momentum around use of Apache Kafka. That was enough for him to decide to make the investment, just a few days after meeting them.

With $6.9 million in backing, Confluent is well-capitalized to go after the market. More importantly, though, Vishria believes the team has the domain expertise to help solve the big data problem for organizations.

“So often you see founding teams that aren’t meant to do the mission of the company,” Vishria said. “But these guys came in and they built their whole careers around solving this problem.”