Clusters are the way to go. Google and Yahoo run their websites on distributed databases spread across vast clusters of servers. Now Aster Data Systems, a startup that is coming out of stealth mode today, is offering a clustered database for Web analytics to any large website. One of its first big customers is MySpace, which is running the database on a cluster of 100 server nodes to analyze which songs and videos are going viral, which features are becoming popular, and what content is being consumed on its service. That comes to more than one terabyte of new data to analyze every day. CEO and co-founder Mayank Bawa explains:
Google and Yahoo had to build this infrastructure for themselves. Others don’t have this. So we will give them a very scalable database, and keep costs low by running it on commodity hardware.
Bawa and his co-founders were Ph.D. students at Stanford when they founded the company in July 2005. They raised an angel round of about $1 million in November 2005 from Stanford computer science professor David Cheriton, Josh Kopelman at First Round Capital, Anand Rajaraman (founder of Junglee and Kosmix), and uber-angel Ron Conway. Cheriton was also one of the early angel investors in Google. Another Google investor, Sequoia Capital, took the entire A round in May 2007. The company is not disclosing the size of that round, but it is believed to be around $5 million.
Kopelman explains why he invested:
AsterData give companies deep insights on massive data by transforming off-the-shelf, commodity hardware into a powerful, self-managing, and scalable analytic database. Data analysis that previously took days to run (or were impossible to run) now routinely finish in minutes/hours. They already have paying customers today and Aster is in production managing billions of events per day.
It’s a data-driven world, and the old-style databases just can’t keep up.