Lies, Damned Lies, And Statistics or How To Get Under John Borthwick's Skin

There are lies, damned lies, and statistics, as Mark Twain once said. A couple of days ago, I wrote a post titled, “What Happened To bit.ly’s Market Share” after I noticed some new statistics on TweetMeme that suggested the market share for short URLs has shifted in the past few months and is actually diversifying as more and more short URLs inundate the Web.

John Borthwick, the investor who incubated bit.ly and then spun it off from betaworks, didn’t like that headline because it called into question bit.ly’s continued dominance. He also didn’t like it because there was a problem with the underlying statistics. Previously, the TweetMeme stats showed only the top five URL shortening services in a given 24-hour period. But then TweetMeme took down the stats for a couple of months while it reworked the underlying architecture to better scale with the incredible growth in these kinds of links. When the stats quietly came back over the holidays, they looked different. Instead of bit.ly showing a 70 to 80 percent share of shortened links on Twitter, it only had 56 percent (today it’s at 58 percent).

One reason for the change was that TweetMeme was counting differently. It now included “other” as a category, whereas before it only showed the relative share of the top five players. Indeed, if you look at relative share, bit.ly is still in the mid-70s. Borthwick pointed this out to me privately via email and I corrected the post. It was something that I missed, but I wasn’t the only one who missed it. Borthwick and Andrew Cohen at bit.ly missed it when I ran the numbers by them prior to posting, and even TweetMeme’s Nick Halstead didn’t catch it. In fact, he told me the data was comparable.
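To see how much the counting method matters, here is a minimal sketch of the arithmetic. The 56 figure matches the share quoted above, but the split between the other four top services and the “other” category is an assumption chosen only to land in the mid-70s. These are not TweetMeme’s actual counts.

```python
def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)

# Hypothetical breakdown per 100 shortened links (illustrative only).
bitly = 56             # matches the 56 percent figure quoted in the post
rest_of_top_five = 19  # assumed: the other four top services combined
other = 25             # assumed: every shortener outside the top five

# Share of all short links, with "other" included in the denominator.
share_of_all = share(bitly, bitly + rest_of_top_five + other)

# Relative share among the top five only, with "other" left out.
share_of_top_five = share(bitly, bitly + rest_of_top_five)

print(share_of_all)       # 56.0
print(share_of_top_five)  # 74.7
```

Same links, same day, two very different-looking numbers. The only thing that changed is the denominator.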

I added the correction but didn’t change the headline because it was still a valid question. The numbers had changed. Why? Borthwick still wasn’t happy, so he wrote his own post this morning with a deliberately misleading headline (“charting the real time web OR the curious tale of how TechCrunch traffic inexplicably fell off a cliff in December”) to make his displeasure known. Duly noted. Of course, the headline got the post on Techmeme even though you have to get halfway through the post to find out “I actually don’t have any data to suggest that happened.” Borthwick also offered some of bit.ly’s own data suggesting that it still has a 68.6 percent share of total short links on Twitter (see his table below).

Now 68 percent does sound better than 58 percent, and it’s pretty darn close to the 70 percent bit.ly constantly cites as its market share. But here’s the thing. Borthwick’s data is based on something known as the Twitter “garden hose.” It is a trickle of data that is a sample of the Tweets going through the service. You can see that by looking at the number of occurrences for each short URL: 4,193 for bit.ly, 6,112 for all of them. TweetMeme’s stats are based on a much bigger set of data: the so-called “firehose.” After filtering for only Tweets with links in them, TweetMeme’s stats are based on more than 3 million Tweets a day. I think I’ll go with TweetMeme’s numbers, but God bless Borthwick for trying to put his company in the best light.
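For anyone who wants to check the math on the garden hose sample, this quick sketch reproduces the 68.6 percent figure from the two counts quoted above. The variable names are mine, and nothing here goes beyond the 4,193 and 6,112 occurrences in Borthwick’s data.

```python
def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)

# Occurrence counts from the garden hose sample quoted in the post.
bitly_links = 4193
all_short_links = 6112

# Everything in the sample that is not a bit.ly link.
non_bitly_links = all_short_links - bitly_links

bitly_share = share(bitly_links, all_short_links)

print(non_bitly_links)  # 1919
print(bitly_share)      # 68.6
```

So the 68.6 percent is real, but it rests on roughly six thousand links in total.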

Yes, the numbers changed. But now we know that bit.ly’s market share was never 80 percent to begin with. That’s not to say that bit.ly is not growing like gangbusters. It is: bit.ly went from shortening 12 million links to more than 2 billion in a year. But so is the rest of the market, which is diversifying and fragmenting as new short-link domains inundate the Web, including ones that are not general-purpose link shorteners but rather tied to specific sites or apps, such as goo.gl or wp.me (WordPress). We can argue about statistics all we want. The more interesting question is: can bit.ly continue to dominate? For the record, I actually think they have a good shot.