Google Acquires reCAPTCHA To Power Scanning For Google Books And Google News

Leena Rao

Wednesday, September 16th, 2009

Google has acquired reCAPTCHA, an open source technology that provides CAPTCHAs to prevent spam and fraud. CAPTCHAs are those security challenges you find on Web sites that require you to decipher and type words or numbers, which lets the site detect whether the user is a human.

Here’s what Google wrote in a blog post about the announcement:

CAPTCHAs are designed to allow humans in but prevent malicious programs from scalping tickets or obtain millions of email accounts for spamming. But there’s a twist — the words in many of the CAPTCHAs provided by reCAPTCHA come from scanned archival newspapers and old books. Computers find it hard to recognize these words because the ink and paper have degraded over time, but by typing them in as a CAPTCHA, crowds teach computers to read the scanned text.

Google says that reCAPTCHA’s technology improves the process that converts scanned images into plain text, known as Optical Character Recognition (OCR). It sounds like Google will be using the technology to power massive scanning projects for Google Books and Google News Archive Search as well as for fraud and spam prevention.

In May, the New York Times reported that Google was developing its own type of CAPTCHA and had also taken notice of the potential of reCAPTCHA’s technology. It sounds like Google found it more effective to acquire reCAPTCHA’s technology than to reinvent the wheel.
