The attributes that define the increasingly critical Data-as-a-Service industry

It’s now common in tech to describe data as “the new oil or electricity”: a fuel that will power innovation and company growth for the foreseeable future. However, data is far from a novel industry. In fact, it’s a decades-old market, and many successful data companies, such as Bloomberg, LiveRamp (now Acxiom), Oracle Data Cloud and Nielsen, have been built in the past and serve as industry leaders… for the time being.

Still, a few characteristics separate today’s world of data businesses from those in the past. The market for data is increasing in size at a rapid rate, mostly due to new methods of measurement (like mobile phones, IoT sensors and satellite imagery) that generate new forms of information, as well as new, prevalent use cases like AI, which rely on huge quantities of high-quality data to work (emphasis on the high-quality).

These changes have led to unprecedented demand for data outside of traditionally data-hungry markets, like finance, marketing and real estate. They have also led to a new iteration of data company that’s being classified as Data-as-a-Service (DaaS), which includes companies like Datanyze (acquired by ZoomInfo), Safegraph, Clearbit, PredictHQ and DataFox. DaaS stresses higher-velocity, higher-quality, near real-time data that can support more rigorous needs, such as training machine learning algorithms. Non-financial corporations are more than happy to ingest external data that will help them streamline their operations, supply chain and marketing.

In this evolving world of Data-as-a-Service, a few attributes lead to a successful company:

DaaS must serve a big enough market. This seems like an obvious point, but too many entrepreneurs assume they can easily sell large volumes of high-quality data. Even though data is in higher demand than ever, the ability to use it and integrate it into general customer workflows has not been democratized. Music downloads and charts, for example, are valuable data, but the customer segment is not large enough at this point and a few players dominate the market. Social media or influencer ranking data, like Klout, is similar. There are many categories of real-time data that do not have the size or impact necessary to sustain a large-scale DaaS business.

DaaS is not about disruption, it’s about empowerment. Many startups want to “disrupt” a space, but DaaS companies need to focus on integrating into existing workflows rather than demanding customers change how they do business. This requires deep customer knowledge, easy integration and data that immediately provide value to the business. Potential customers have seen the buzz around Big Data, Hadoop and business intelligence, but the only thing they talk about is dashboard fatigue. It’s important that DaaS companies focus on seamless integration and solving a well-defined customer problem.

DaaS should have increasing incremental margin. Data businesses often have significant COGS, particularly at small scale. However, as a data business gets larger, the gross margins can improve dramatically. So it’s important to understand whether the cost of acquiring or generating data changes as you bring on new customers. I call this incremental margin: the change in the difference between the cost of producing data and how much that data can be sold for. If your gross margin is significantly higher for your fiftieth customer than it was for your first, then you are on your way to building a venture-backable business (or, if the margin is high enough, you may not even need VC backing at all). This increasing margin is a key pillar in building out a large, sustainable DaaS company.
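
To make the incremental margin idea concrete, here is a minimal sketch in Python. The cost structure (a fixed cost to produce the data set plus a small per-customer delivery cost) and every number in it are assumptions chosen for illustration, not figures from any real company. CostModel, gross_margin and incremental_margin are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CostModel:
    """Hypothetical cost structure for a data business (illustrative only)."""
    fixed_data_cost: float     # cost to acquire or generate the data set, per period
    cost_per_customer: float   # delivery and support cost each customer adds
    price_per_customer: float  # revenue from each customer, per period


def gross_margin(model: CostModel, customers: int) -> float:
    """Gross margin as a fraction of revenue at a given customer count."""
    if customers < 1:
        raise ValueError("need at least one customer to compute a margin")
    revenue = model.price_per_customer * customers
    cogs = model.fixed_data_cost + model.cost_per_customer * customers
    return (revenue - cogs) / revenue


def incremental_margin(model: CostModel, customers: int) -> float:
    """Change in gross margin from adding the customers-th customer."""
    if customers < 2:
        raise ValueError("need a previous customer count to compare against")
    return gross_margin(model, customers) - gross_margin(model, customers - 1)


if __name__ == "__main__":
    # Assumed numbers: the data costs 100,000 to produce regardless of how
    # many customers buy it, and each customer pays 20,000 for it.
    model = CostModel(fixed_data_cost=100_000,
                      cost_per_customer=2_000,
                      price_per_customer=20_000)
    for n in (1, 5, 10, 50):
        print(f"{n:>3} customers: gross margin {gross_margin(model, n):.1%}")
```

Under these assumptions the first customer is deeply unprofitable, because one sale has to carry the entire cost of producing the data, while the fiftieth customer adds revenue at almost no extra cost. If the cost of producing the data instead rose in step with each new customer, the margin would stay flat no matter how many customers signed on.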

DaaS must be machine readable. Today, data accuracy is increasingly powering company innovation, and quality becomes more important as data is used for AI training purposes. If a company is using data for something like a marketing campaign, it’s not critical if the data is of poor quality. Moreover, people have accepted a rock-bottom level of quality to date: often as much as 80 percent of marketing data may be erroneous. However, when data is being used to power AI applications and machine learning algorithms, low data quality could be disastrous. In other words, DaaS must be machine readable. Some data may need to be cleaned up; Trifacta is an example of a company that provides the tools to ensure higher-quality data. Other companies, such as Crowdflower (now Figure Eight), Mighty AI and Samasource, label data and clean it up for algorithmic use.

DaaS must have continuous movement. In other words, there should be continuous value in data getting refreshed. A successful DaaS company does not provide data to serve a one-time use case; rather, the data should have a combination of velocity (change over days or hours) and inherent value in knowing the changes that are occurring. The higher the data velocity, the more potential value exists within that company’s data. Real estate and stock market data are good examples of value increasing with greater velocity.

DaaS must tell a story. Numbers are no longer enough. DaaS companies must provide the tools and analytics or AI to unlock data, identify trends and then provide context around those trends. AI is particularly useful in finding correlations across data sets that humans would never know to look for. Safegraph, which produces granular location data, provides us with some great examples of this. Location data is far more than the sum of its parts when it contains enough velocity and accuracy. For example, when paired with ZIP Code-based income data, location data can tell us quite a bit about food deserts and their disproportionate effect on poorer households that have to travel three times farther to get to a grocery store. Or, location data can tell us about the vast differences in travel patterns across different cities — information that is critical in the development of autonomous vehicles, where different vehicle types and considerations will be necessary for different use cases.

The above attributes are ones that differentiate DaaS businesses from more traditional data companies. Startups looking to build sustainable, high-growth companies should heed these critical elements. As the need for AI-enhanced products grows, DaaS will only grow with it — but it is data quality, velocity and margins that will decide whether or not a startup is successful in the long run. As demand for DaaS increases, I expect we’ll also see an entire industry of data marketplaces and data cleaning products and services built around it.