web application security lab

reCAPTCHA Image Processing To Stop Bots

A few weeks ago Ben Maurer posted a link to a service called reCAPTCHA that attempts to solve the spam problem in the typical CAPTCHA way while solving another hard problem at the same time. reCAPTCHA is part of a project to scan old books, and part of the problem with scanning using OCR is that you get crap results sometimes. Therein lies the reCAPTCHA idea - replaying that odd looking text to users and getting them to type the answers in, next to a real CAPTCHA whose answer is already known. If the user gets the known CAPTCHA right, the system assumes their reading of the OCR image is valid too.
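To make the pairing concrete, here is a minimal sketch of how a known word can gate answers for an unknown one. Everything in it is an assumption for illustration: the class name, the method names, and especially the `min_votes` agreement threshold are hypothetical and are not reCAPTCHA's actual API or rules.

```python
from collections import Counter


class TwoWordChallenge:
    """Pairs a known control word with an unknown OCR word (hypothetical names)."""

    def __init__(self, control_answer, min_votes=3):
        self.control_answer = control_answer
        self.min_votes = min_votes  # assumed threshold, not the real service's rule
        self.votes = Counter()      # readings of the unknown word from trusted users

    def submit(self, control_guess, unknown_guess):
        """Return True if the user passes as human; record their reading if so."""
        if control_guess.strip().lower() != self.control_answer.lower():
            return False  # failed the known word, so the other answer is ignored
        guess = unknown_guess.strip().lower()
        if guess:  # a blank answer passes the test but casts no vote
            self.votes[guess] += 1
        return True

    def consensus(self):
        """Return the agreed reading of the unknown word, or None if undecided."""
        if not self.votes:
            return None
        word, count = self.votes.most_common(1)[0]
        return word if count >= self.min_votes else None
```

The point of the design is that the unknown word never decides whether you are human; only the control word does. The unknown word just collects readings from people who already passed.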

So the next question is: what if someone doesn’t answer the second word at all, or puts in something erroneous? That’s okay - it uses a voting system to make sure more than one person agrees (I’m not sure of the specifics of the voting system). That makes for a pretty interesting system in a lot of ways. However, one comment made by “Anonymous” on Ben’s site caught my eye.

“Chinese radio scare alert: these people want to exploit your brainpower with their captcha tricks! It’s like enslaving humanity, one word at a time!”

You certainly get extra points for originality of your idea.

I’m sure nobody will get my Chinese radio reference though…

What Anonymous is referring to is the Chinese Lottery. It’s a theory in cryptography where you can force many people to do very small tasks to get the answer to a bigger problem (in the lottery example, government-supplied radios are each forced to perform a small part of a very large crypto problem). Applied to people, it would only make sense if you could somehow ask users to perform a math function that is more efficient for a human to do than a computer. There is another similar theory using biochemical reactions in a DESosaur, where each cell of an organism performs a small part of a computationally complex task; given the volume of cells in any sizable creature, it would have enormous computing power.
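For anyone who hasn't run into the Chinese Lottery before, the idea can be sketched in a few lines. This is a toy under loud assumptions: a 16-bit key so it runs instantly, a SHA-256 fingerprint standing in for a real known-plaintext check, and a plain loop standing in for radios that would really work in parallel. All function names are made up for the example.

```python
import hashlib

KEY_BITS = 16  # toy key size (assumption); a real cipher key is far larger


def fingerprint(key: int) -> str:
    """Stand-in for 'does this key decrypt the known plaintext?'."""
    return hashlib.sha256(key.to_bytes(2, "big")).hexdigest()


def slice_for(radio_id: int, num_radios: int) -> range:
    """The small chunk of the key space that one radio is told to try."""
    total = 1 << KEY_BITS
    size = -(-total // num_radios)  # ceiling division
    start = radio_id * size
    return range(start, min(start + size, total))


def radio_search(radio_id: int, num_radios: int, target: str):
    """One radio tries only its own slice and reports a hit, if any."""
    for key in slice_for(radio_id, num_radios):
        if fingerprint(key) == target:
            return key
    return None


def lottery(num_radios: int, target: str):
    """Return (winning radio, key), or None if no radio finds the key."""
    for radio_id in range(num_radios):  # sequential here; parallel in the theory
        found = radio_search(radio_id, num_radios, target)
        if found is not None:
            return radio_id, found
    return None
```

With 1,000 radios each one only has to try 66 keys at most, which is the whole appeal: no single participant does much work, but together they cover the entire key space.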

Granted, reCAPTCHA is terrible at this - it is far more efficient to perform any mathematical task with a computer than with anything a human could do. The only way I could see this being used in a nefarious way, other than the CAPTCHA proxy idea, is if part of what a government needed to do was OCR classified documents (this could be even more effective in other languages where translation services are at a premium). While possible, it sounds like quite a conspiracy theory to me. But Anonymous can rest assured that someone out there understood his reference! ;)

4 Responses to “reCAPTCHA Image Processing To Stop Bots”

  1. Legionnaire Says:

    Xm, it’s interesting in a “hey look at that!” way.

    I mean the CAPTCHA part isn’t something special and I bet current CAPTCHA-solving bots will be able to do a much better job at this therefore proving the security part useless, yet providing huge aid to the book digitizing project.

    P.S.: No conspiracy here :P

  2. Ben Maurer Says:

    Actually, CAPTCHA solving bots can (for the most part) only solve simple CAPTCHAs like the ones used in many open source packages (see …). Well designed CAPTCHAs, such as those used by Google, Microsoft, Yahoo, etc., are most likely not breakable by current software at any rate that makes the attack useful. While only time will tell, we believe that reCAPTCHA has security that is equivalent to the commonly used CAPTCHAs.

  3. Ronald van den Heetkamp Says:

    Well it actually is happening, take a look at this RSnake: http://www.0×

    I guess you set a trend :)

  4. MustLive Says:

    Guys, as I wrote on my site, reCaptcha is not such reliable protection.

    MoBiC-04: reCaptcha CAPTCHA bypass