Digg.com and other pages from its website have disappeared from Google’s search result pages, following a recent update to Google’s algorithms. It’s unclear at this time exactly what caused the site to be de-listed, though the current speculation is that it has to do with Digg pointing to bad or spammy links. [UPDATE: See below; Google says it will fix the problem.]
You can see the issue right now if you Google the keyword “digg” – the site is no longer the top search result. Instead the top results are the Digg Wikipedia page, its Twitter account, and other links that are highly ranked for the word “digg.”
In addition, if you Google “site:digg.com,” Google will display a message saying no results are found.
Some conspiracy theorists out there are even pointing out that the de-listing came shortly after Digg’s announcement that it would build a replacement for Google Reader, but that’s more than a bit crazy. For Google to drop the site from its index, the cause would have to be a technical error of some sort, or something to do with the links Digg is hosting.
As one person on Hacker News points out:
Doing a site:digg.com/news/ search on Bing shows a lot of pages like these: http://digg.com/news/gaming/ing_bank_i_ilanlar and even more duplicate tag and rss pages for “site:digg.com/tag/” and “site:digg.com rss”.
These /news/ pages 302 redirect to many different sites (some are bound to contain spam or be of lower quality).
302 redirects for these links is bad practice. Some link shorteners (ab)use 302 Found (instead of 301 Moved Permanently) to hoard content that doesn’t belong to them. The content for these links can’t be found on digg.com, so they too use the wrong redirect and associate themselves with all pages they link to.
Besides that: Digg.com acts like a single page webapp for most of its content. There are no discussion pages or detail pages for the stories. The content that does appear is near duplicate to other content on the web, especially with popular stories, where many blogs just copy the title and the first intro paragraph.
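The 301-versus-302 distinction the commenter raises is easy to check for any given link. The following is only a rough sketch, not anything Digg or Google is known to run: the helper names `classify_redirect` and `first_hop` are made up for illustration, and the second one needs a live network connection.

```python
import http.client
from http import HTTPStatus
from urllib.parse import urlsplit

# Status codes that tell a crawler the move is permanent vs. temporary.
PERMANENT = {HTTPStatus.MOVED_PERMANENTLY, HTTPStatus.PERMANENT_REDIRECT}  # 301, 308
TEMPORARY = {HTTPStatus.FOUND, HTTPStatus.SEE_OTHER,
             HTTPStatus.TEMPORARY_REDIRECT}                                # 302, 303, 307


def classify_redirect(status_code: int) -> str:
    """Hypothetical helper: label an HTTP status code by redirect type."""
    try:
        status = HTTPStatus(status_code)
    except ValueError:
        return "unknown"
    if status in PERMANENT:
        return "permanent"
    if status in TEMPORARY:
        return "temporary"
    return "not a redirect"


def first_hop(url: str):
    """Hypothetical helper: request a URL without following redirects.

    Returns (status_code, Location header or None). Requires network access.
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        # http.client never follows redirects, so this is the first response.
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Calling `classify_redirect(first_hop(url)[0])` on one of the /news/ links would show whether it answers with a temporary or a permanent redirect at the time of the request.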
Another user on Hacker News noted that Digg’s robots.txt file now reads: User-agent: * Disallow, which would imply an issue on Digg’s part, not Google’s.
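Whether a robots.txt rule like that actually blocks crawlers depends on what follows the Disallow directive, which the comment as relayed here does not show. As a hedged illustration only, the sketch below uses Python’s standard `urllib.robotparser` to contrast the two common cases; the sample rule sets and the `is_allowed` helper are assumptions for the example, not Digg’s actual file.

```python
from urllib.robotparser import RobotFileParser

# Two illustrative rule sets (assumptions, not Digg's real robots.txt).
OPEN_RULES = "User-agent: *\nDisallow:"       # empty value: nothing is blocked
BLOCKED_RULES = "User-agent: *\nDisallow: /"  # "/" blocks the entire site


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Hypothetical helper: may `agent` fetch `url` under these rules?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    page = "http://digg.com/news/"
    print("empty Disallow:", is_allowed(OPEN_RULES, page))
    print("Disallow: /   :", is_allowed(BLOCKED_RULES, page))
```

An empty Disallow value permits crawling of every page, while `Disallow: /` asks compliant crawlers to stay away from the whole site, so the missing value matters a great deal when reading the commenter’s claim.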
However, Digg GM Jake Levine tells us that he suspects the issue might have to do with the links Digg is hosting from the old Digg.
“We’re chatting with Google to work this out, but my guess is that it has something to do with the links we’re hosting from the old Digg. When we acquired Digg we inherited tens of thousands of links to old Digg submissions, some of which ranked well in search,” Levine explains. “We decided the right thing to do was to redirect all of those links to the original source URLs. This may have tripped up Google’s index. I’m sure we’ll be able to get this sorted out shortly.”
“The good news is that it doesn’t really impact us all that much,” he adds. “The vast majority of our traffic is direct (like 90%+) so it’s not a huge deal for us from a business/user perspective.”
Digg is still waiting for Google to respond at this point, so we’ll know more soon.
Despite Levine’s assurance that most of Digg’s traffic is direct, and that the Google de-listing won’t affect the site, I’ve seen first-hand what a change like this can do when a big-name site is involved. Several years ago, when I worked at ReadWriteWeb (now ReadWrite), an article on the site became the top hit for Facebook in Google. Since many people don’t actually type a URL (they just Google a site name), thousands of confused users flocked to the comments, complaining that they couldn’t log in and asking what had happened to Facebook. Years later, we’re still getting a good laugh about this.
UPDATE: Google has provided an official response:
We’re sorry about the inconvenience this morning to people trying to search for Digg. In the process of removing a spammy link on Digg.com, we inadvertently applied the webspam action to the whole site. We’re correcting this, and the fix should be deployed shortly.
UPDATE, 2:20 PM ET: Digg has now returned.