Greplin: 1.5 Billion Documents Indexed, Six Engineers

    Wednesday, April 27th, 2011

    J. Michael Arrington (born March 13, 1970 in Huntington Beach, California) is a serial entrepreneur and the founder of TechCrunch, a blog covering startups and technology news. Arrington attended Claremont McKenna College (BA Economics, 1992) and Stanford Law School (JD, 1995) and practiced as a corporate and securities lawyer at two law firms: O’Melveny & Myers and Wilson Sonsini Goodrich...

    Late last year we first mentioned Y Combinator startup Greplin – it’s a startup that indexes your social stuff in the cloud, making all your Facebook, Gmail, LinkedIn, Google Calendar, Evernote, Twitter, Dropbox and just about everything else searchable. The easiest way to describe it is “the other half of search.”

    They opened their doors to customers in February. The company won’t talk about total user numbers yet, which isn’t surprising. But we have dug one interesting data point out of founder Daniel Gross: they’ve now indexed some 1.5 billion documents. And they’re indexing about 30 million new documents per day.

    What this means: when you join Greplin you authorize it to index various social apps and services. A typical user may sign up and start off by authorizing Greplin to index Facebook, Twitter and Gmail, for example. Greplin then grabs everything in those services (all your Facebook messages and updates, all your Twitter updates and DMs, all your Gmail messages back and forth, etc.) and lets you search them. When you add up all those documents for all users, you get to that big number, 1.5 billion.

    To put this into perspective, that’s about the size of Google’s web-wide index in 2001. Or 60 times the size of Google’s original 1998 index of 25 million documents.

    On the daily side, Greplin’s 30 million new documents a day is about 25% of Twitter’s current load (and Twitter gets off easy with 140-character documents). It’s not an apples-to-apples comparison, but it gives you some idea of the scale that they’re already reaching. And remember, they launched in February.
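
    As a rough sanity check, the ratios quoted above can be reproduced in a few lines of Python. This is only a back-of-the-envelope sketch: the variable names are made up for illustration, and the Twitter volume is merely what the 25% comparison implies, not a figure reported by Twitter or Greplin.

```python
# Figures quoted in this post.
greplin_total_docs = 1_500_000_000   # documents indexed so far
greplin_docs_per_day = 30_000_000    # new documents indexed per day
google_1998_index = 25_000_000       # Google's original 1998 index

# How many times larger Greplin's index is than Google's 1998 index.
ratio_vs_google_1998 = greplin_total_docs / google_1998_index

# If 30 million a day is about 25% of Twitter's load, the implied Twitter
# volume is four times that (an inference from the post, not an official number).
implied_twitter_per_day = greplin_docs_per_day / 0.25

print(f"Greplin vs. Google 1998: {ratio_vs_google_1998:.0f}x")
print(f"Implied Twitter volume: {implied_twitter_per_day:,.0f} per day")
```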

    And all that with just six engineers and one support person, says Gross. He has Amazon Web Services to thank for that, although the recent outage didn’t make him too happy.

    Company: Greplin
    Website: greplin.com
    Funding: $4.72M

    Greplin, launched in September 2010, is a personal web search engine that indexes information stored in online services.
