Investors flock to fund an AI cornerstone: Feature stores

“Feature stores,” with their dreary and opaque moniker, might not sound like the sexiest subject.

But they’re an essential part of the AI systems that enterprises — and consumers, for that matter — use every day. That’s why they’re attracting an increasing amount of attention and investment from venture firms, which see the market opportunity growing into the distant future.

AI systems are made up of many components, one of which is features: the individual variables that act as inputs to the system. It can help to picture a table in which the data used by an AI system is organized into rows of examples (data from which the system learns to make predictions) and columns of attributes (data describing those examples). Features are the attributes used to describe each example. An AI spam detector, for instance, might use features like words in the email body or a sender’s contact information.
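To make the table picture concrete, here is a minimal Python sketch of the spam detector case. The feature names (word_count, contains_free, sender_domain) and the sample emails are hypothetical, chosen only for illustration; a real detector would use far richer features.

```python
# Hypothetical spam-detector features; names and sample data are illustrative only.
def extract_features(email):
    """Turn one raw email (an example) into a row of features (attributes)."""
    body_words = email["body"].lower().split()
    return {
        "word_count": len(body_words),                    # size of the email body
        "contains_free": "free" in body_words,            # a word in the body
        "sender_domain": email["sender"].split("@")[-1],  # sender contact information
    }

emails = [
    {"sender": "promo@deals.example", "body": "Claim your FREE prize now"},
    {"sender": "alice@corp.example", "body": "Agenda for the Tuesday meeting"},
]

# Each row is one example; each key is one feature column.
feature_table = [extract_features(e) for e in emails]
```

Each entry in feature_table corresponds to one row of the table described above, and each dictionary key to one column.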

Working with features tends to be an ad hoc process within a single AI system. But at the enterprise scale, where data science teams are responsible for maintaining dozens to thousands of systems, a place to manage and track features becomes a necessity.

Enter the feature store, a centralized repository for organizing, storing and serving the features that AI systems rely on. Introduced as a concept by Uber in 2017, feature stores provide a unified place to build and share features across different teams in an organization.

“Feature stores sit at the intersection of data and machine learning,” Michael Del Balso, the CEO of Tecton.ai, a startup developing feature store software for businesses, told TechCrunch in an email. “[Feature stores are] an essential part of the ‘MLOps’ stack because they enable data teams to quickly, reliably build high-quality features using real-time data and serve those features in production for real-time inference. They serve as the interface between data and [AI] models.”

More than a simple database, a feature store lets data engineers see statistics on features, including which features have been used, where they’ve been used and the impact they’ve had on models. Feature stores also transform data, allowing users to aggregate, filter and join features without necessarily needing to write code. (Think of aggregating orders at a restaurant to get the feature value “number of orders over the past 30 minutes.”)
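The restaurant example boils down to a time-window aggregation. The sketch below shows one way it might be computed by hand in Python. The function name orders_in_window and the sample timestamps are assumptions made for illustration and do not reflect any particular vendor's product.

```python
from datetime import datetime, timedelta

def orders_in_window(order_times, now, window=timedelta(minutes=30)):
    """Count orders whose timestamp falls in the interval (now - window, now]."""
    start = now - window
    return sum(1 for t in order_times if start < t <= now)

now = datetime(2021, 6, 1, 12, 0)
order_times = [
    datetime(2021, 6, 1, 11, 15),  # older than 30 minutes, not counted
    datetime(2021, 6, 1, 11, 40),
    datetime(2021, 6, 1, 11, 55),
    datetime(2021, 6, 1, 12, 5),   # later than "now", not counted
]

# Feature value: "number of orders over the past 30 minutes"
orders_last_30_min = orders_in_window(order_times, now)
```

A feature store would typically keep a value like this up to date automatically as new orders arrive, rather than recomputing it by hand.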

Del Balso explained: “Advanced feature stores … automate production pipelines to collect data from batch data sources and real-time sources, transform the data in real time, and store the data in the offline and online store. [They often also] include built-in monitoring capabilities to monitor pipeline health, data drift, service levels and more.”
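The split between an offline and an online store can be illustrated with a toy Python sketch. This is a simplified assumption of how such a design might look, not Tecton's implementation: the class ToyFeatureStore and its methods are hypothetical, the offline store is modeled as a full history (useful for training) and the online store as the latest value per key (useful for fast lookups at prediction time).

```python
class ToyFeatureStore:
    """Hypothetical, in-memory illustration of offline and online stores."""

    def __init__(self):
        self.offline = []  # every value ever written, kept for training data
        self.online = {}   # latest value per (entity, feature), kept for serving

    def write(self, entity_id, feature, value, timestamp):
        # Append to the history, then refresh the online value if this is newer.
        self.offline.append((entity_id, feature, value, timestamp))
        key = (entity_id, feature)
        current = self.online.get(key)
        if current is None or timestamp >= current[1]:
            self.online[key] = (value, timestamp)

    def get_online(self, entity_id, feature):
        """Return the freshest value, or None if the feature was never written."""
        entry = self.online.get((entity_id, feature))
        return None if entry is None else entry[0]

    def get_history(self, entity_id, feature):
        """Return all (timestamp, value) pairs in the order they were written."""
        return [(ts, v) for (e, f, v, ts) in self.offline
                if e == entity_id and f == feature]

store = ToyFeatureStore()
store.write("restaurant_42", "orders_30m", 7, timestamp=1)
store.write("restaurant_42", "orders_30m", 9, timestamp=2)

latest = store.get_online("restaurant_42", "orders_30m")
history = store.get_history("restaurant_42", "orders_30m")
```

Production systems add the pieces Del Balso describes on top of this idea, such as automated pipelines and monitoring, which the sketch leaves out.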

Image Credits: Tecton.ai

Feature stores promise to enhance collaboration between teams while streamlining the development of AI systems. As the demand for them grows, tech giants and startups like Tecton are developing products to meet the need — and investors are backing them enthusiastically.

Since its founding in 2019, Tecton has raised $60 million from venture firms including Andreessen Horowitz and Sequoia Capital to build out its feature store platform. Rasgo and Molecula, two competitors, have each snagged about $20 million in venture capital.

Google and AWS now offer feature store solutions, too. So do later-stage AI development platform startups such as Databricks and Splice.

“Investors have seen companies across various industries build new products and revenue streams via machine learning and AI. They understand the need for solutions that enable this trend for the masses,” Jared Parker, the co-founder and CEO of Rasgo, told TechCrunch via email. “Feature stores provide an accelerated path to getting models into production while also enabling centralization, collaboration and governance of the data pipelines that drive machine learning.”

Supercharging the feature stores segment is the broader boom in MLOps, the set of practices for deploying AI systems in production reliably and efficiently. According to Cognilytica, the global market for MLOps platforms — which can include feature store products and services — will be worth $4 billion by 2025, up from $350 million in 2019.

As Del Balso explained, the upticks signal the enterprise’s embrace of “operational” machine learning after years of struggling to deploy AI systems into production — and maintain them afterward. According to a 2020 Forrester study, 41% of companies say they’ve experienced challenges operationalizing any machine learning models and “lack the process to do so.” One 2019 white paper suggested that about 85% of AI projects fail eventually — even when companies prioritize AI and machine learning over other IT initiatives.

Image Credits: Rasgo

“Operational machine learning consists of running machine learning in production to power customer-facing applications and to automate business processes,” Del Balso said. “Operational machine learning can be used to support a very broad range of use cases, including fraud detection, real-time pricing, product recommendations, risk assessment and more. It’s the future, and enterprises that become good at building and deploying operational machine learning will acquire a significant competitive advantage.”

So what does that mean for the future of feature stores? With the segment flush with capital and interested customers, Del Balso sees an opportunity to take advantage of the cloud to make features based on public data (e.g., weather and stock prices) broadly accessible. He also expects automation to play a larger role in feature stores, helping suggest the reuse of existing features and perhaps even creating new features as needed.

His predictions align with those in Deloitte’s Tech Trends 2021 report. The firm writes that feature stores may eventually be able to predict the demand for certain features based on the types of data being modeled, potentially improving the performance of AI systems by reducing the delay in serving features.

“In an ideal world, a data scientist would simply say, ‘This is what I’m trying to predict,’ and the feature store would suggest (or build) the most predictive features for that use case,” Del Balso said.

From Parker’s perspective, data transformation and feature management are the frontiers the industry must master to capture the entire value of AI — and feature stores are the way to achieve this.

“A model is only as good as its data, and too many projects never make it to production because of the sheer time it takes to prepare data for machine learning. … The market is struggling with this problem and understands the massive competitive advantages of more agility and speed within the data science function,” Parker said. “By accelerating [the] process via the use of feature stores, we can get more models into production and finally achieve the promise of AI.”