Cultivated data is the next Gold Rush

Five years ago, Frank Meehan, my SparkLabs Global Ventures co-founder, described the goal of our seed-stage fund as follows:

“The future is data. We are looking to invest in companies that are generating valuable data around usage patterns, customer behavior, company information.”

It was prescient. It has guided us well over the years, and it has also allowed us to look at relevant startups with a critical eye. During the first three years of our fund, we looked at startups, especially in the Internet-of-Things space, that collected millions of data points, but most companies weren’t willing to pay for such data. Although industries such as insurance are built on data and information, many industries are only beginning to grasp the importance of such insights, especially as our lives integrate into the digital world.

These past few years, I’ve seen a general trend of startups improving how they collect, analyze and present data across numerous industries, and Fortune 1000 companies becoming more willing to pay for such cultivated data.

Industrial manufacturing, search and social media data, and a handful of other verticals are long-established gold mines for data and analytics. What we’re seeing now, across our portfolio of more than 250 startups, is that data and analytics are finally being valued and becoming mission critical: They are no longer “just another tool” to have in the toolbox, but key to a company’s success.

Cultivated data is gold

I define “cultivated data” as existing data (e.g., ERP data, Google Analytics, public health data, inventory data) that is analyzed and developed into a more usable form than it was before. This doesn’t have to involve the complex data sets and inordinate amounts of computing power that signify “big data”; it can simply mean applying approaches and techniques to data sets that previously weren’t utilized. Cultivated data isn’t always about volume, variety or velocity of data. It’s more important for the output to be relevant and actionable.

One of our first SparkLabs Global Ventures investments in this space was 42 Technologies. Retailers such as Rebecca Minkoff, AllSaints, Faherty Brand and others have found 42 Technologies’ data analytics invaluable. When 42 Technologies graduated from Y Combinator, it primarily analyzed point-of-sale data to find diamonds in the rough in retailers’ inventory. Today, the company has expanded to using wholesale sell-in data, sell-through data, warehouse inventory data and other data sets to provide multiple insights to retailers.

Even for companies whose core product isn’t data, the data they have access to has become extremely valuable, so new revenue lines are being created. We’ve seen this in less expected areas, ranging from niche e-commerce to pet food to consumer reviews, where for some companies data has become one of the primary sources of revenue.

For example, Vizio, a large consumer electronics manufacturer (more than $3 billion in revenue), has accumulated the largest single source of opt-in smart TV viewing data available; it launched an influential subsidiary, Inscape, around this business.

The new data aggregators

This new age of cultivated data has created and will create new data aggregators. Instead of traditional startups attempting to disrupt the middleman, these new startups are becoming the middlemen of data insights.

A mobility data management and analytics startup called Populus (a SparkLabs Global Ventures portfolio company) aggregates rideshare, scooter share, bike share, traffic, public transit and other mobility source data to present actionable insights for city and transportation planners. Most cities would not have the resources or knowledge to do what Populus does.

One of our SparkLabs Korea accelerator investments, Chartmetric, is rapidly becoming the go-to resource for the music industry in today’s streaming world. It has become a new data aggregator, as company founder and CEO Sung Cho describes, because Chartmetric “distills the data and distills further until they get something actionable” for its customers. Additionally, Chartmetric has become a trusted source of data and data insights, as different music labels and bands might report their numbers quite differently.

In the years to come, we expect to see more of these new data middlemen, driven by similar “trusted source” issues, by the shortage of good data scientists, and by people who want to create their own future by launching their own startups.

No data scientists is the new data scientist

The shortage of AI experts is making it hard for even Fortune 500 companies to recruit them, with Google, Facebook and other top tech companies hoarding such talent. And it’s not only great AI developers: data scientist positions are also becoming harder to fill. One outcome is the rise of analytics platforms that empower people to become their own data scientists.

For example, companies such as ThoughtSpot (raised $300 million from Lightspeed, Khosla and others), Rockset (raised $21 million from Greylock and Sequoia) and more specialized plays such as Falkonry (one of our portfolio companies) have each taken different approaches to the market. ThoughtSpot provides real-time analytics and search and query capability across multiple sectors. Rockset seems focused on search and analytics query services for large enterprises. Falkonry focuses on predictive analytics for industrial operations, a much narrower focus than the other two examples.

This analytics platform space will only heat up in the coming years, and I expect other new approaches to emerge to fill this gap in talent and capabilities within company walls.

Drilling for data all over the world

Our firm has also seen some governments spurring more innovation within the data space. In South Korea, the Korea Data Agency, which was established in 1993, has over the past couple of years been encouraging the development of a data marketplace. Some of our SparkLabs Korea portfolio companies get paid a few hundred thousand U.S. dollars per year to open up their data to the public, and the Korea Data Agency has created vertical consortiums to encourage the building of standards for data structures within specific industries such as finance, healthcare and transportation. I assume other top OECD nations will create similar programs to encourage economic growth and activity within the data aggregation and analytics space.

From well-coordinated government policies to market forces to increased startup activity around cultivated data, these trends and developments signal that this space will be one of the major gold rushes for startups and venture capital over the coming years. Data is truly the future, and the time to stake claims and mine it for insights and prosperity is now.