Mozilla Stealth Data Project Could Be Just What The Internet Needs

One of the most frustrating parts of my job is finding reliable traffic and other usage data about websites.

But today, Mozilla CEO John Lilly and VP of Engineering Mike Schroepfer said they may eventually fix that problem by drawing on the massive installed base of Firefox users.

The State of Analytics Today

There are three ways to measure web traffic.

The first is user-focused and relies on software installed on users' machines. Services like Alexa and Compete get users to install software on their computers and then track surfing habits to come up with best guesses on Internet-wide traffic. It works in theory, but recruiting enough users to get statistically meaningful results has proven challenging. Alexa is famously flawed, and while Compete seems to be somewhat better, it only tracks U.S. users. Comscore is another user-focused metrics company; it tends to work well for large sites but not well at all for newcomers, and access to its database is very expensive.

A second way to determine site usage is to track traffic directly from websites. Quantcast combines user surveys with direct tracking on websites (when they can get it) to estimate traffic. Comscore also does this with certain sites.

The third way is to track surfing behaviors via records from ISPs. Hitwise uses this method to provide web analytics to clients.

None of these services is particularly accurate, as shown by the fact that they almost always disagree with each other. The problem is simply gathering enough data from enough users to draw an accurate picture of actual Internet usage. That’s why I’ve called for Google to let users make their Google Analytics data publicly available. Would many people do it? Just the ones who want us to trust the user numbers and page views they claim.

How Firefox Could Fix The Problem

The product is still at a very early stage, say Lilly and Schroepfer. In fact, it doesn’t have a project name within Mozilla; they simply refer to it as “Data.” But the idea is fairly straightforward. Ask Firefox’s 170 million (and growing) users whether they would like to opt in to anonymous data collection on their surfing habits. Then take that anonymized data and create statistically meaningful analytics reports for all websites.

Only a small percentage of those 170 million users would have to agree to be tracked (Lilly said 1% is more than enough) to get useful data. There are Firefox users in every country, and the distribution is fairly attractive for worldwide analytics tracking. Only 29% of Firefox users are in the U.S.; 13% are in Germany, 6% in France, 4% in the UK, and so on. Firefox is now available in 50 different languages.
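To get a feel for why 1% could be "more than enough," here is a rough back-of-the-envelope sketch in Python. It uses the 170 million and 1% figures quoted above; everything else is an assumption made for illustration, including the hypothetical site reach of 0.1% and the idea that opt-in users behave like a simple random sample. None of this reflects how Mozilla would actually compute anything.

```python
import math

FIREFOX_USERS = 170_000_000  # installed base cited in the article
OPT_IN_RATE = 0.01           # the 1% figure attributed to Lilly


def panel_size(users: int, opt_in_rate: float) -> int:
    """Number of users in the panel if the given share opts in."""
    return round(users * opt_in_rate)


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of a ~95% normal-approximation interval for a proportion.

    Assumes the n panel members are a simple random sample, which a
    self-selected opt-in group is not guaranteed to be.
    """
    return z * math.sqrt(p * (1 - p) / n)


n = panel_size(FIREFOX_USERS, OPT_IN_RATE)

# Hypothetical small site visited by 0.1% of panel members in a month.
reach = 0.001
moe = margin_of_error(reach, n)

print(f"Panel size: {n:,} users")
print(f"Estimated reach: {reach:.3%} +/- {moe:.4%}")
print(f"Relative error: {moe / reach:.1%}")
```

With a panel of about 1.7 million users, the sampling error on even a small site's reach comes out as a small fraction of the estimate itself. What this arithmetic does not capture is bias: a big sample of Firefox users is still a sample of Firefox users.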

Of course, this would track only Firefox users, not users of IE, Safari, Opera and other browsers. And Firefox users as a group may have different surfing habits than the Internet as a whole. But as Firefox usage grows more mainstream, this will become less and less of a problem. Mozilla estimates that it now has 18% market share across all browsers.

If and when this launches, it would likely be the most reliable public traffic and usage data available. Let’s hope they do launch it, and soon. I’ll be the first to sign up.