Google Gives Publishers More Control Over How It Crawls Their News

Yesterday, Google threw complaining publishers a bone with its First Click Free program, which lets news sites limit any individual reader to five free clicks a day from Google News. News sites have long accused Google of profiting from their content through Google News, and today Google is making another concession to publishers.

Google is launching a new crawler that will let publishers keep their content out of Google News while remaining in Google Search. Publishers have always been able to do this by filling out a contact form, but Google is now making it easier by automating the process with a news-specific crawler.

Currently, publishers can block Google from including their content in Google’s main index via the Robots Exclusion Protocol (REP). When Google’s crawler arrives at a site, it checks for a robots.txt file to see whether the search engine has permission to crawl the site. This gives publishers the option to block an entire site or only certain sections or pages.
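To make the robots.txt check concrete, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt contents, the example.com URLs and the `may_crawl` helper are illustrative assumptions, not anything Google has published, and Python's matching rules are only an approximation of how Google's own crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every crawler is asked to stay out of /archive/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archive/
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The blocked section is off limits; the rest of the site is not.
print(may_crawl(ROBOTS_TXT, "Googlebot", "https://example.com/archive/2008/story.html"))  # False
print(may_crawl(ROBOTS_TXT, "Googlebot", "https://example.com/today/story.html"))  # True
```

Changing the rule to `Disallow: /` would block the whole site, while listing a single path would block just that page.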

Google is applying this technology specifically to News. News sites can now block images from the news crawler but not from general web search. Or, on the flip side, a news site could choose to have its content indexed in Google News but not in Google’s main search index. If a publisher opts out of Google News but still wants to be indexed in Google Search, its content will show up in search results but won’t appear in the block of news results that sometimes shows up in Web Search.
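The two scenarios above can be sketched with the same standard-library parser. The article does not name the new crawler's user-agent token; `Googlebot-News` is assumed here, and the robots.txt contents, URLs and `is_allowed` helper are hypothetical. Python's parser matches user agents by substring, which is not necessarily how Google resolves them, so treat this as an illustration rather than a reference.

```python
from urllib.robotparser import RobotFileParser

NEWS_AGENT = "Googlebot-News"  # assumed token for the news crawler
SEARCH_AGENT = "Googlebot"     # main web search crawler

# Scenario 1: stay in Web Search, drop out of Google News entirely.
OPT_OUT_OF_NEWS = """\
User-agent: Googlebot-News
Disallow: /
"""

# Scenario 2: keep stories in Google News, but withhold images from it.
NO_IMAGES_IN_NEWS = """\
User-agent: Googlebot-News
Disallow: /images/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


STORY = "https://example.com/story.html"
PHOTO = "https://example.com/images/photo.jpg"

print(is_allowed(OPT_OUT_OF_NEWS, NEWS_AGENT, STORY))      # False
print(is_allowed(OPT_OUT_OF_NEWS, SEARCH_AGENT, STORY))    # True
print(is_allowed(NO_IMAGES_IN_NEWS, NEWS_AGENT, PHOTO))    # False
print(is_allowed(NO_IMAGES_IN_NEWS, NEWS_AGENT, STORY))    # True
print(is_allowed(NO_IMAGES_IN_NEWS, SEARCH_AGENT, PHOTO))  # True
```

Because neither file mentions the main crawler, web search is unaffected in both cases; only the news-specific group changes what appears in Google News.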

Publishers stand to gain from being indexed in Google News, which sends publishers about 1 billion clicks every month (out of about 4 billion clicks per month from all of Google). Josh Cohen, Senior Product Manager for Google News, writes:

Each of those clicks is an opportunity for publishers, allowing them to show ads, sell subscriptions and introduce readers to the great content they produce every day. While we think this offers a tremendous opportunity for any publisher who wants new readers, publishers are the ones who create the content and they’re in control of it. If they decide they don’t want to be in Google, it’s easy to do.

The new crawler gives publishers like Rupert Murdoch even more ways to block their news from Google. It’s a Christmas wish come true.