Kaggle, a platform for predictive data modeling competitions, has raised $11 million in Series A financing led by Index Ventures and Khosla Ventures. Also participating in the round were SV Angel; Yuri Milner’s Start Fund; Stanford Management Company, which invests and manages Stanford University’s endowment and other financial assets; PayPal founder Max Levchin; Google Chief Economist Hal Varian; and Applied Semantics co-founder and Factual Chief Executive Officer Gil Elbaz. Neil Rimer, partner at Index Ventures, will join Kaggle’s board of directors, and Levchin has been named chairman of the company.
Kaggle’s platform for predictive modeling competitions helps companies, governments, and researchers identify solutions to some of the world’s hardest data problems by posting them as competitions to a community of more than 17,000 PhD-level data scientists located around the world.
The Kaggle community of data scientists comprises thousands of PhDs from quantitative fields such as computer science, statistics, econometrics, maths and physics. They come from over 100 countries and 200 universities. In addition to the prize money and data, they use Kaggle to meet, network and collaborate with experts from related fields. As Kaggle founder Anthony Goldbloom tells me, “we’re making big data science into a sport.”
Here’s how it works. Companies and organizations post large data sets to the platform and ask scientists to solve a problem or answer a question using the data. The thousands of data scientists who participate in Kaggle competitions then develop algorithms to solve these large-scale problems and submit iterations of their algorithms throughout each competition.
Kaggle maintains a real-time leaderboard of each competition’s standings, so competitors are motivated to beat the current best score until the competition closes. Once a competition ends, the sponsoring organization has a solution, and the field’s top entrants take home the competition prize. Thus far, data scientists from all over the world have submitted nearly 47,000 entries to various Kaggle competitions.
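As a rough illustration of the leaderboard mechanic, here is a minimal Python sketch. It is not Kaggle's actual scoring code: the `rmse` metric, the `leaderboard` function, the competitor names, and the sample data are all assumptions made for the example. It scores each entry against a held-back answer set and keeps each competitor's best result.

```python
from math import sqrt


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual):
        raise ValueError("prediction and answer lengths differ")
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def leaderboard(submissions, answers):
    """Rank competitors by their best (lowest) RMSE across all entries.

    submissions: iterable of (competitor_name, predictions) pairs in
    submission order; a competitor may appear more than once.
    """
    best = {}
    for name, predictions in submissions:
        score = rmse(predictions, answers)
        if name not in best or score < best[name]:
            best[name] = score
    # Lowest error first; ties are broken alphabetically for a stable order.
    return sorted(best.items(), key=lambda item: (item[1], item[0]))


# Hypothetical data, for illustration only.
answers = [3.0, 5.0, 2.0]
submissions = [
    ("alice", [2.0, 5.0, 2.0]),
    ("bob", [3.0, 5.0, 4.0]),
    ("alice", [3.0, 5.0, 2.0]),  # a later, improved entry
]
standings = leaderboard(submissions, answers)
```

Here `standings` lists alice first, because her second entry matches the answers exactly, followed by bob. A real competition would choose an error metric suited to the problem and would typically score against data the competitors never see.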
Kaggle says the results have led to new data discoveries and breakthroughs across many industries. For example, a competition for NASA, the Royal Astronomical Society, and the European Space Agency identified new ways to map dark matter in the universe, while another competition helped better determine the likelihood that the health of an HIV patient would improve or deteriorate.
Another example comes from insurance company Allstate, which ran a Claim Prediction Challenge to determine which vehicles among a subset of its customers were more likely to be involved in an accident. Allstate provided two years of data on the cars it insures for the scientists to analyze.
Kaggle is currently hosting the $3 million Heritage Health Prize, the largest medical prize ever, designed to help reduce billions of dollars in unnecessary hospitalizations.
Google’s Varian says this of Kaggle: “Kaggle is a way to organize the brainpower of the world’s most talented data scientists and make it accessible to organizations of every size. By structuring incentives to create a competitive environment, Kaggle drives data scientists to produce better results than they would if they were working alone.”
Of course, many companies and firms may not want to upload classified and sensitive data to a public platform. Kaggle offers private competitions for organizations working with sensitive data or intellectual property. In private competitions, data is shared with a carefully selected group of Kaggle scientists who are held to a non-disclosure agreement, have passed a background check, and have performed extremely well in previous Kaggle competitions. Every competitor who participates in a private competition is awarded prize money based on his or her performance.
“Kaggle is working on one of the most exciting opportunities in big data analytics that I’ve seen in the last twenty years,” said Vinod Khosla, founder and partner, Khosla Ventures. “Kaggle’s platform has the potential to change the way we tackle data analysis problems.”
Kaggle says the new funding will go toward hiring (the company currently has just one developer) and toward sales and marketing efforts.