Google: Spam Really Has Increased Lately. We're Fixing That, And Content Farms Are Next

Over the last month, you may have seen some of the reports that Google’s search results are overloaded with spam. This isn’t a new phenomenon (for years now I’ve been tearing my hair out whenever I try to find a manufacturer instruction manual online), but people are noticing that it’s getting worse. Fortunately, Google seems to be listening.

Today Matt Cutts, who heads Google’s search quality team, has written a blog post stating that there has indeed been a “slight uptick of spam in recent months”, and he details what Google is doing to fix it.

First, Cutts describes some of the tweaks Google is making to its algorithms specifically to address the recent increase in spammy results:

To respond to that challenge, we recently launched a redesigned document-level classifier that makes it harder for spammy on-page content to rank highly. The new classifier is better at detecting spam on individual web pages, e.g., repeated spammy words—the sort of phrases you tend to see in junky, automated, self-promoting blog comments. We’ve also radically improved our ability to detect hacked sites, which were a major source of spam in 2010.

But Google isn’t going to stop there. Now, finally, it sounds like they’re going to do more to take on sites that just repurpose content from other sites (hopefully including the countless sites that repost TechCrunch articles verbatim):

And we’re evaluating multiple changes that should help drive spam levels even lower, including one change that primarily affects sites that copy others’ content and sites with low levels of original content.

The most interesting part of the blog post is Cutts’s discussion of so-called “content farms”: sites that consist primarily of low-quality content, typically produced specifically because it will rank well in search results. It’s not clear whether any crackdown would affect ‘professional’ content farms (like Associated Content and Demand Media) or only smaller-time outlets, but it could be a big deal.

As “pure webspam” has decreased over time, attention has shifted instead to “content farms,” which are sites with shallow or low-quality content. In 2010, we launched two major algorithmic changes focused on low-quality sites. Nonetheless, we hear the feedback from the web loud and clear: people are asking for even stronger action on content farms and sites that consist primarily of spammy or low-quality content. We take pride in Google search and strive to make each and every search perfect. The fact is that we’re not perfect, and combined with users’ skyrocketing expectations of Google, these imperfections get magnified in perception. However, we can and should do better.

Cutts does say a few things to defend Google’s search quality. For one, he says that English-language web spam is appearing in results less than half as often as it was five years ago. He also notes that, despite some theories to the contrary, Google will take action against spammy sites that feature Google ads (the theory goes that Google makes money from these ad-loaded content farms, so it isn’t incentivized to remove them).
