Google’s New Street View Image Recognition Algorithm Can Beat Most CAPTCHAs

Here is an interesting conundrum for Google: it has created an algorithm that’s significantly better at reading street numbers in Street View images, which helps it give you more accurate directions. It turns out, though, that this algorithm is so good that it can also decipher more than 99 percent of CAPTCHAs (those squiggly text puzzles you often have to solve to prove you are human).

Google’s new algorithm for detecting street numbers can accurately detect and read difficult numbers in Street View 90 percent of the time, Google disclosed today. According to a joint paper by Google’s Street View and reCAPTCHA teams (PDF), recognizing this kind of data in natural photographs is a pretty hard problem. Just think of all the variations in lighting and issues with motion and focus blur. At the same time, it’s also essential for Google’s mapping efforts to get this data from these images.

The standard approach is to separate out the localization, segmentation and recognition steps, but Google’s new approach unifies all of these steps and uses a “deep convolutional neural network,” a kind of neural network that’s especially effective for image recognition. On Google’s publicly available Street View House Numbers dataset, the algorithm reads the complete street number correctly about 96 percent of the time. On a per-digit basis, it’s 97.84 percent accurate. The regular Street View imagery is a bit more challenging, which explains why it is “only” 90 percent accurate on that data.
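The two accuracy figures measure different things: a whole street number only counts as correct if every digit in it is right, while the per-digit figure scores each digit on its own. Here is a minimal Python sketch of that difference. The function names and the sample data are made up for illustration, and the paper's exact evaluation procedure may differ.

```python
def whole_number_accuracy(predicted, actual):
    """Fraction of street numbers where every digit was read correctly."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


def per_digit_accuracy(predicted, actual):
    """Fraction of individual digits read correctly, position by position.

    Assumption for this sketch: a digit missing from the prediction counts
    as wrong, and extra predicted digits are ignored.
    """
    correct = 0
    total = 0
    for p, a in zip(predicted, actual):
        total += len(a)
        correct += sum(pc == ac for pc, ac in zip(p, a))
    return correct / total


# Made-up sample data, not taken from the Street View dataset.
actual = ["1044", "27", "315", "8"]
predicted = ["1044", "21", "315", "8"]

print(whole_number_accuracy(predicted, actual))  # 0.75 (3 of 4 numbers)
print(per_digit_accuracy(predicted, actual))     # 0.9  (9 of 10 digits)
```

A single misread digit costs an entire street number in the first metric but only one digit in the second, which is why the per-digit score comes out higher.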

So far, Google says the system has helped it to extract close to 100 million street numbers worldwide.

To test the algorithm, Google also let it loose on its own reCAPTCHA system, where it is 99.8 percent accurate on the hardest reCAPTCHA puzzles. Given that the whole idea of CAPTCHAs is that they are too hard for computers to solve, that’s a pretty stunning number. The accuracy is likely better than that of most humans (at least I know I don’t get anywhere close to 99.8 percent accuracy when I try to solve CAPTCHAs…).

That’s obviously a problem for reCAPTCHA, because developers who are less interested in the science behind the algorithm could use the same technique to spam blog comments, for example. Google, however, says that its CAPTCHA system is now less dependent on deciphering the distorted text than ever before. Instead, reCAPTCHA now looks at a broader range of clues. Entering the text is just one clue, and Google now looks at the puzzle as “a medium of engagement to elicit a broad range of cues that characterize humans and bots.”