Yesterday, Twitter analytics lead Kevin Weil gave a talk at O’Reilly’s Strata 2011, a conference dedicated to big data. The main topic of the talk was Rainbird, Twitter’s realtime counting system, which is built on top of Cassandra. Notably, it powers a number of things Twitter uses internally, such as Promoted Products analytics, operational monitoring, and even Tweet Button counting. Today, Twitter posted the entire presentation to SlideShare, which means we can now embed it above.
It’s fairly technical, but also pretty easy to follow. If you’re at all interested in how Twitter acquires, stores, and uses the massive amount of data it deals with, you should check it out. It also gives a glimpse into the company’s Promoted Tweet analytics package (which looks quite nice).
Most importantly, you’ll probably want to know about Rainbird because Twitter plans to open source it. First, though, the company has to wait for the version of Cassandra it’s using to be officially released, and for some of its own internal code to be open sourced. Weil promises that it will happen.