As companies build machine learning models, operations teams need to ensure the data feeding those models is of sufficient quality, a process that can be time-consuming. Bigeye (formerly Toro), an early-stage startup, is helping by automating data quality monitoring.
Today the company announced a $17 million Series A led by Sequoia Capital with participation from existing investor Costanoa Ventures. That brings the total raised to $21 million, including the $4 million seed the startup raised last May.
When we spoke to Bigeye CEO and co-founder Kyle Kirwan last May, he said the seed round would be focused on hiring a team (the company now has 11 employees) and building more automation into the product. He says they have achieved that goal.
“The product can now automatically tell users what data quality metrics they should collect from their data, so they can point us at a table in Snowflake or Amazon Redshift or whatever and we can analyze that table and recommend the metrics that they should collect from it to monitor the data quality — and we also automated the alerting,” Kirwan explained.
He says the company is focusing on data operations issues that affect a model’s inputs, such as a table that isn’t updating when it’s supposed to, missing rows or duplicate entries. Bigeye can automate alerts for those kinds of issues and speed up the process of getting model data ready for training and production.
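To make those three kinds of issues concrete, here is a minimal Python sketch of what such checks might look like. It is an illustration only, not Bigeye's implementation: the function name `find_quality_issues`, its parameters, the thresholds and the sample rows are all hypothetical, and it runs on an in-memory list rather than a real warehouse table.

```python
from datetime import datetime, timedelta


def find_quality_issues(rows, key, updated_at, now, max_age, min_rows):
    """Return human-readable data quality issues for a table.

    Hypothetical helper for illustration; `rows` is a list of dicts.
    """
    issues = []

    # Freshness: the newest row should be no older than max_age.
    latest = max((row[updated_at] for row in rows), default=None)
    if latest is None or now - latest > max_age:
        issues.append("stale: table has not updated on schedule")

    # Volume: fewer rows than expected suggests missing data.
    if len(rows) < min_rows:
        issues.append(
            f"missing rows: expected at least {min_rows}, found {len(rows)}"
        )

    # Uniqueness: the same key appearing twice is a duplicate entry.
    seen, dupes = set(), set()
    for row in rows:
        value = row[key]
        if value in seen:
            dupes.add(value)
        seen.add(value)
    if dupes:
        issues.append(f"duplicate keys: {sorted(dupes)}")

    return issues


# Made-up sample data: two old rows that share the same order_id.
now = datetime(2021, 4, 1, 12, 0)
rows = [
    {"order_id": 1, "updated_at": now - timedelta(days=2)},
    {"order_id": 1, "updated_at": now - timedelta(days=3)},
]
issues = find_quality_issues(
    rows, "order_id", "updated_at", now, timedelta(hours=24), 3
)
for issue in issues:
    print(issue)
```

On this sample the sketch reports all three problems: the table is stale, it has fewer rows than expected and one key is duplicated. In a real pipeline, each reported issue would presumably trigger an alert rather than a print statement.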
Bogomil Balkansky, the partner at Sequoia who is leading today’s investment, sees the company attacking an important part of the machine learning pipeline. “Having spearheaded the data quality team at Uber, Kyle and Egor have a clear vision to provide always-on insight into the quality of data to all businesses,” Balkansky said in a statement.
As the founders begin building the company, Kirwan says that hiring a diverse team is a key goal and something they are keenly aware of.
“It’s easy to hire a lot of other people that fit a certain mold, and we want to be really careful that we’re doing the extra work to [understand that just because] it’s easy to source people within our network, we need to push and make sure that we’re hiring a team that has different backgrounds and different viewpoints and different types of people on it because that’s how we’re going to build the strongest team,” he said.
Bigeye offers on-prem and SaaS solutions, and while it’s working with paying customers like Instacart, Crux Informatics and Lambda School, the product won’t be generally available until later in the year.