web application security lab

Google “What’s Up” CAPTCHA

I don’t have time for a full-blown Google rant today, but I was forwarded this link today and I couldn’t believe my eyes. This is Google’s “What’s Up” CAPTCHA. You know, when I first heard about it, it was described to me as “a picture and you have to tell it which way is up”. So my first reaction was “that’s a terrible CAPTCHA - only one in four chance.” Well, it’s not that bad. If you actually read the paper, it’s a 1/22 chance (assuming no optimizations).

There are other problems with this though - like the fact that it relies on a set of pictures and someone has to make a judgment call on what is the correct position. I bet it’s easier to solve for humans, but it’s also fairly trivial for robots to solve too. CAPTCHA - what does that mean anyway? Let’s see if Google’s project meets the definition:

Completely Automated - Google employees need to make judgment calls ahead of time on each image orientation, so this requirement of a true CAPTCHA fails and incidentally adds a hidden cost to using the “What’s up” CAPTCHA, although it might not be huge, if you make the set small (which would cause other problems).

Public - well, as public as anything Google does is public. It’s not open source or anything, but it’s out there.

Turing Test to tell Computers - I would argue that it’s not a Turing test at all, because if you have a set of 45 robots that try only one guess apiece, Google’s “What’s up” will fail to catch about two of them. And again - that’s with zero optimizations. Second major failure making this not actually a CAPTCHA.

and Humans Apart - I think it fails this one as well, since blind people are humans. So are non JavaScript/Flash/CSS wielding users - I know I’m human. So that’s three major failures of one definition alone. Not great!
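The arithmetic behind the 45-robot figure is easy to check. The sketch below assumes the unoptimized 1-in-22 single-guess rate quoted above and fully independent guesses; the function names are made up for illustration.

```python
from fractions import Fraction

# Assumption: each bot makes one independent random guess with the
# unoptimized 1-in-22 success chance mentioned in the post.
SINGLE_GUESS_RATE = Fraction(1, 22)


def expected_passes(bots: int, rate: Fraction = SINGLE_GUESS_RATE) -> float:
    """Expected number of bots that get through with one guess each."""
    return float(bots * rate)


def chance_at_least_one_passes(bots: int, rate: Fraction = SINGLE_GUESS_RATE) -> float:
    """Probability that at least one of the bots guesses correctly."""
    return float(1 - (1 - rate) ** bots)


if __name__ == "__main__":
    print(f"expected passes for 45 bots: {expected_passes(45):.2f}")
    print(f"chance at least one passes:  {chance_at_least_one_passes(45):.2%}")
```

With 45 bots the expected number of passes comes out at roughly two, which is where the "fail to catch two of them" figure comes from.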

Someone with far greater math skills than I will some day create the mathematical proof that explains why CAPTCHAs aren’t technically achievable. It’s possible to create tests that are vaguely good at telling computers and humans apart (CAPVGTCHAs perhaps?) but unless my understanding of the universe is way off base, I think CAPTCHAs are modern day perpetual motion machines. Everyone thinks they get it and it can work, but it’s never been done, and no one has come even close, in my mind. Sorry, I know this wasn’t as good a Google rant as I normally come up with, but as one of their guys over there recently told me, “You don’t call, you don’t rant…” I know… too busy!

29 Responses to “Google “What’s Up” CAPTCHA”

  1. Kyo Says:

    I’m not as pessimistic about it as you are. I think it’s a fairly clever idea, which works great in theory. As you point out there are flaws with the practicality, but if you have a big enough set of images and a low enough chance of success (bundling two images in one captcha, for example) it would definitely be an improvement over those text captchas which robots can solve better than humans. Of course it’s not the perfect solution, but neither are current practices and this is definitely a step in the right direction in both user friendliness and security.

  2. Me Says:

    This part of the paper is fairly important:

    The computer success rate of our CAPTCHA is based on two
    factors: the window of error we would allow people to make when
    rotating the image upright, and the number of images that they
    would need to rotate. For example, if we allowed users to rotate
    images in a 6-degree window (3° on either side of upright) the
    machine success rate would be 6/360. If users were required to
    rotate 3 images to their upright position, the computer success rate
    would be decreased to (6/360)^3. A CAPTCHA system which
    displayed ≥ 3 images with a ≤ 16-degree error window would
    achieve a guess success rate of less than 1 in 10,000, a standard
    acceptable computer success rates for CAPTCHAs [8].
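The numbers in that excerpt can be reproduced with a few lines. This is only a sketch of the quoted formula, assuming a bot that picks every rotation uniformly at random; `guess_success_rate` is a name invented here, not something from the paper.

```python
def guess_success_rate(window_degrees: float, images: int) -> float:
    """Chance that uniformly random rotations land inside the window for every image."""
    if not 0 < window_degrees <= 360:
        raise ValueError("window_degrees must be in (0, 360]")
    if images < 1:
        raise ValueError("images must be at least 1")
    return (window_degrees / 360.0) ** images


if __name__ == "__main__":
    # The three cases mentioned in the quoted passage.
    for window, images in [(6, 1), (6, 3), (16, 3)]:
        rate = guess_success_rate(window, images)
        print(f"{window}-degree window, {images} image(s): about 1 in {1 / rate:,.0f}")
```

Note that a single image with a 16-degree window gives 16/360, or 1 in 22.5, which appears to be where the 1/22 figure in the post comes from.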

  3. Ross Dargan Says:

    Captchas are doomed to fail as there is a very cheap (human!) labour force willing to solve captchas for sub $4 per 1000 - far better off trying to detect automated processes (incl spamming) than trying to prevent the initial registration.

    just my two cents.

  4. RSnake Says:

    @Kyo - actually if you think about it, it’s worse. The time to enter in 4-6 digits is almost negligible (giving 10,000 to 1,000,000 possible combinations respectively). We’re talking a few seconds. The amount of time it would take me to rotate 4-5 images (let’s say that’s comparable in complexity to make it easy) is at least as bad - especially if it’s not really clear to me what is “up” which is a big crux of the problem - I might have to really think about it. Also we are only talking about un-optimized. If you want to try to optimize the problem it ends up way worse.

    How you say? Okay, let’s talk about images in general. Most images are of things in nature or things in a human world. Both of which fit a certain standard set of rules. Things like lamp-posts have a lot of vertical lines in them. Things like sea-scapes have a lot of horizontal lines in them. Beds and cars are supposed to be at angles you say? Yes, that is where perspective detection comes into play - calculating vanishing points is relatively easy. Things like cups and rims on cars are ovals. Ovals on the long side represent a vertical or horizontal axis almost always. Those types of optimizations turn what sounds like a hard problem into a relatively easy one by comparison. They are wildly overestimating their own effectiveness against a well tuned program.
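One way to picture the "lots of vertical or horizontal lines" optimization is a dominant-orientation estimate. The sketch below is an assumption-laden illustration, not the method anyone in this thread actually used: it works on a tiny grayscale image given as a list of rows, uses a simple structure-tensor calculation, and the function name is hypothetical. It only narrows the answer to an axis; telling up from down would still need extra cues.

```python
import math


def dominant_gradient_angle(image):
    """Return the dominant gradient direction in degrees, in [0, 180).

    `image` is a list of rows of grayscale values. Edges run perpendicular
    to the gradient, so a result near 0 means mostly vertical lines and a
    result near 90 means mostly horizontal lines.
    """
    jxx = jyy = jxy = 0
    height, width = len(image), len(image[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = image[y][x + 1] - image[y][x - 1]  # central difference in x
            gy = image[y + 1][x] - image[y - 1][x]  # central difference in y
            jxx += gx * gx
            jyy += gy * gy
            jxy += gx * gy
    # Orientation of the principal axis of the structure tensor.
    theta = 0.5 * math.atan2(2 * jxy, jxx - jyy)
    return math.degrees(theta) % 180.0


if __name__ == "__main__":
    size = 16
    # Synthetic test images: stripes two pixels wide.
    lamp_posts = [[255 if (x // 2) % 2 else 0 for x in range(size)] for _ in range(size)]
    sea_scape = [[255 if (y // 2) % 2 else 0 for _ in range(size)] for y in range(size)]
    print("vertical stripes:  ", dominant_gradient_angle(lamp_posts))
    print("horizontal stripes:", dominant_gradient_angle(sea_scape))
```

A solver could rotate a challenge image until this angle lands on 0 or 90, which would leave only a handful of candidate orientations instead of a full circle.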

    My point is - just because you or even I don’t immediately see an easy way to defeat it doesn’t mean it’s not defeatable. It just means we haven’t spent enough time thinking about it. Incidentally I know some people who were working on exactly this problem to determine which way was up in the case where people were submitting images that were not yet properly rotated. It turns out detecting peaks of houses, lawns and fences is extremely easy. Google just doesn’t understand the problem, just like everyone else I’ve met who’s developed their own “solution” doesn’t understand the problem.

    The real crux of this problem is that I haven’t yet seen a CAPTCHA. Ever. Not by its definition anyway. Just because you say it’s something doesn’t mean it is.

  5. Andrew Says:

    There is no need for google employees to go through loads of images marking whether they are upright.

    As long as the images come from a source where the images are known to be upright the CAPTCHA algorithm can rotate the image as it pleases and no human intervention is required.

  6. Ben Maurer Says:

    This is a fairly interesting paper. However, it is extremely ironic that on the same day Google announced this, they also announced a similar images service (http://similar-images.googlelabs.com/). An attacker who can use such a service can greatly increase their accuracy - when they get an image, they search for similar images and either 1) label some small subset (eg, you only figure out which way is up on one tiger, then any tiger image can be broken) or 2) search for an image which can automatically be labeled (eg, look for a tiger image where there is sky + grass so up can be detected automatically)

    This of course assumes the image isn’t indexed by the search engine. If the CAPTCHA were to use publicly available images (it’d be difficult to get a large collection of images that aren’t) similar images could likely find the source image given the rotated version.

  7. Ben Maurer Says:

    @Ross Dargan:

    I have an apartment with 1) a lock on the door 2) an alarm system. I’m fairly sure that the lock can be picked. Besides, there are windows which somebody could just break. Somebody who’s really clever could probably figure out a way around the alarm system.

    Despite the imperfection of the security systems, I keep them there. They greatly raise the bar an attacker must meet to break into my apartment. Using multiple systems provides defence in depth.

    A CAPTCHA is very similar. Detecting spammy and abusive requests is a difficult problem. Websites employ many different techniques to mitigate spam, including CAPTCHAs. Even though it is possible to buy human typists, the CAPTCHA provides depth to security by 1) keeping script kiddies at bay, 2) requiring additional programming time to launch an attack, 3) only allowing spammers that generate enough revenue to justify paying for typists. 4) Rate limiting the spammers [especially when it’s night time in India].

  8. Ross Dargan Says:

    @Ben Maurer

    You are right - it does keep some spam off sites so it is worth it, but if I was to spend time focusing on reducing spam I’d perhaps try to keep the time spent on implementing a captcha to a minimum and spend time creating methods to detect automated methods of spamming a forum, or spotting unusual posting patterns (like the same message multiple times, or a high post count over a very short period with a lot of $s in or something like that).
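The posting-pattern checks described here could look something like the following. It is a minimal sketch under stated assumptions: the class name, thresholds, and reason strings are all invented for illustration, state is kept in memory only, and the "lots of $s" heuristic is left out.

```python
from collections import defaultdict, deque


class PostingMonitor:
    """Flags accounts that repeat a message or post too fast.

    The thresholds are illustrative assumptions, not tuned values.
    """

    def __init__(self, max_posts=5, window_seconds=60.0, max_repeats=2):
        self.max_posts = max_posts
        self.window_seconds = window_seconds
        self.max_repeats = max_repeats
        self._times = defaultdict(deque)  # user -> recent post timestamps
        self._counts = defaultdict(lambda: defaultdict(int))  # user -> message -> count

    def record(self, user, message, timestamp):
        """Record a post and return a list of reasons it looks automated."""
        reasons = []
        times = self._times[user]
        times.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while times and timestamp - times[0] > self.window_seconds:
            times.popleft()
        if len(times) > self.max_posts:
            reasons.append("high post rate")

        # Normalize case and whitespace so trivial variations still match.
        normalized = " ".join(message.lower().split())
        self._counts[user][normalized] += 1
        if self._counts[user][normalized] > self.max_repeats:
            reasons.append("repeated message")
        return reasons


if __name__ == "__main__":
    monitor = PostingMonitor(max_posts=3, window_seconds=10, max_repeats=2)
    for second in range(4):
        print(second, monitor.record("bot42", "Buy  $$$ now", float(second)))
```

A forum would call `record` on every submission and hold or drop posts that come back with any reasons, rather than relying on the registration step alone.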

    Perhaps what we need to do is create a captcha that’s so hard only a computer would be stupid enough to try it, then automatically fail it if it gets it correct lol.

  9. Ben Maurer Says:

    @Ross

    That’s why there’s reCAPTCHA: a CAPTCHA implementation takes only 5 minutes :-).

  10. RSnake Says:

    @Andrew - that is not true, unfortunately. There are millions, if not billions of photos out there that when cut up into a circle would not give a human enough to tell which way was upright. Even if they could narrow it down to pictures of things in nature and the real world, there are still tons of pictures out there that aren’t obvious which orientation they are, because the subject matter isn’t useful for determining that information (birds in flight, underwater pictures, post modern sculptures, pictures of fabric, etc.).

  11. Ben Maurer Says:

    @RSnake

    It doesn’t really matter if not all images have a clear “up” because in a multi-image CAPTCHA, a “test” value can be included that isn’t graded a-la-reCAPTCHA. One thing the paper doesn’t mention about this is that it increases the chances of a bot randomly guessing (since if they use N captchas and allow for X degrees of error, the random guess rate is (X/360)^(N-1)).

    Anyway, this is an interesting concept, but like many academic CAPTCHA papers, it’s a far cry from a practical service. In addition to security issues, there are many usability issues (they probably used the HTML canvas code or flash to do rotation, which is a no-no for browser support), accessibility issues (how do you make sure you get images that color blind people understand), and other types of issues to understand.

  12. Scott Says:

    I do not think anyone at google has to predetermine the orientation. They have images.google.com, and they know that 99.99% of them are correctly oriented. They can even pull out just the faces.

    All they need to do is perform a rotation to the other non accurate images. I sort of like this idea.

    I hate captchas though, google are by far the worst, and near impossible for humans to do.

  13. Kyo Says:

    I’m not as pessimistic about it as you are. I think it’s a fairly clever idea, which works great in theory. As you point out there are flaws with the practicality, but if you have a big enough set of images and a low enough chance of success (bundling two images in one captcha, for example) it would definitely be an improvement over those text captchas which robots can solve better than humans. Of course it’s not the perfect solution, but neither are current practices and this is definitely a step in the right direction in both user friendliness and security.
    P.S. - Sorry, forgot to tell you great post!

  14. I Says:

    the easier the captcha is for a human to decode, the faster a “very cheap (human!)labour force” can solve them

    it’s a no win

  15. Kai Sellgren Says:

    As a response to earlier post talking about human CAPTCHA solvers.

    Do note that CAPTCHAs are supposed to separate human beings from computers. If someone pays dollars for a group of poor CAPTCHA solvers, then it is not CAPTCHA’s responsibility to detect it.

    If you want to also prevent those kinds of situations from happening, you would need to implement an additional scheme that does it. Let’s say we have a registration form that we want to protect from spamming. So, asking for a Credit Card will pretty much make sure your registered users are “unique” and not spam. However, asking for a CC in usual situations is a doomed solution. You probably need to ask something that requires good English and some basic maths, for instance. Then you can no longer have spam registrations generated by poor human beings from the East. This is all assuming that your CAPTCHA did its job (prevented computers, allowed all human beings).

    Google has money and a huge collection of data that it could use for creating CAPTCHAs. They could create a good CAPTCHA, but no CAPTCHA is perfect. If you implement enough human-like processing AI, it will pass all CAPTCHAs.

  16. Steve Nordquist Says:

    Which way is up in goatse; If you aim down you’re going up him. Would the automatic orientation method work on pix of the Kool-Aid massacre? This is fun-to-break; whether it’s CCD color-pattern edge inconsistency or JPEG block error minima (within a circle) I suppose the challenge is to break it within a small amount of Javascript.

    I think I’d prefer: Architectural Digest v. Apartment For Rent photos. Mall Finials v. Venician Masters. Donna Karan or Ferragamo? Tool or tuille? Dead or in chemotherapy? Colloq gathering honoring of U. Chicago or the Palm Pre? Que es mas macho?

  17. DeadFish Says:

    Almost all human logic can be imitated by a computer. I personally couldn’t think of any that could not. CAPTCHAs require logic to solve, and there is nothing about that logic that only a human can do. The complexity of the logic might deter people from solving it using a computer. And at the end of the day, I think deterrence is the main idea of CAPTCHAs. Just my 2 cents.

  18. riosatiy Says:

    What about the possibility of people creating huge databases of “Whats Up” images in their upright position, and using them to create a robot that can get it right?

    I might underestimate the great amount of different pictures they would use?

  19. Michael Says:

    > It’s possible to create tests that are vaguely good at telling
    > computers and humans apart

    What are you waffling about at the end?
    There’s no machine created today that would even get close to passing for a human. It’s completely trivial to tell machines and humans apart.

    Tests can be created that make it a complete no brainer, let alone vague.

    That said, you’d probably be right had your argument been to criticise the acronym given to these website tools as being flawed. But despite that, that doesn’t really make Google’s new captchas good or bad.

    Obviously the purpose of these tools is not really anything to do with Turing tests and Alan Turing’s work. The tools exist simply to stop people from abusing internet services. Since the abuse has, in part, come from automated tools, the derived name of the acronym is easy to see, but, taking that name too literally is your flaw.

    It seems self evident that no tool is going to be a complete solution to the actual problem (rather than your imagined problem where google were supposedly implementing something defined by Alan Turing. D’oh) since the abusers can change and adapt the ways in which they abuse services.

    It’s an evolutionary process, just like there’s no quick fix, solve all solution to spam or crime or whatever other anti-social behaviour exists.

    Are these Google captchas good? Well, the answer to that isn’t going to be a mathematical proof nor is it going to be some pedantic drivel about the meaning of the acronym. No, it’ll simply be a pragmatic and subjective decision based upon the empirical evidence gathered after the fact - in other words Google will have to “suck it and see”

  20. RSnake Says:

    @Michael - I think we actually agree. When I said “vaguely good” I mean that compared against existing technology it’s not terrible, but it’s certainly not good either. But that doesn’t mean that no robot existing or conceived could do a simple human task. It just means we haven’t built it yet. That in my mind makes it vaguely good. Deterrence by the unwillingness of an attacker to bother is still vaguely good, but it’s not the same thing as a CAPTCHA (by its strict definition).

  21. DCC Says:

    Personally, I like the idea of honeypot captchas. No image, just the textbox, made invisible with CSS. Scripts would still fill the field out, and since human users who legitimately want to use the form would ignore any hidden fields, you just check to see if there is any data in the hidden field. If so, you can safely assume it’s not a legitimate user and then ignore the form data. Obviously, someone “could” get around it, but it is going to achieve the desired result without requiring the user to deal with the captcha at all.
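The server-side half of a honeypot field is only a few lines. This is a rough sketch, assuming the form arrives as a plain dictionary; the field name `website` and both function names are hypothetical, and the CSS that hides the field is not shown.

```python
HONEYPOT_FIELD = "website"  # hypothetical field name, hidden from humans via CSS


def looks_automated(form_data: dict) -> bool:
    """Return True when the hidden honeypot field came back non-empty."""
    value = form_data.get(HONEYPOT_FIELD, "")
    return bool(str(value).strip())


def handle_submission(form_data: dict) -> str:
    """Silently drop honeypot hits; otherwise accept the submission."""
    if looks_automated(form_data):
        return "ignored"
    return "accepted"


if __name__ == "__main__":
    print(handle_submission({"comment": "Great post", "website": ""}))
    print(handle_submission({"comment": "Cheap pills", "website": "http://spam.example"}))
```

Dropping the submission silently, rather than showing an error, avoids telling a script which field gave it away.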

  22. anon Says:

    IMO people are not that much better than modern technologies at anything and technology will only improve while people will not be able to stay ahead of computers in any field for very long. So I don’t see a humanity test which might work in all cases… well other than some kind of fast physical DNA test used to log in :/ but that causes other problems.

  23. Damien Says:

    There’s no need for *any* one person to determine the upright orientation of the photos (neither a Google employee nor the website on which the image is hosted).

    All Google has to do is add one image for which it doesn’t yet know the correct orientation into the captcha process. The combined efforts of thousands of people orienting it will allow Google to determine the correct orientation, determine what margin of error (in degrees) is needed to ensure 99% of people get it right, and even to discard the images where there’s no good answer for what’s upright.
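The crowd-sourced labelling idea can be sketched with a circular mean. Everything here is an assumption for illustration: the function name, the 0.9 agreement threshold, and the choice of mean resultant length as the agreement measure. It does not work out the margin of error needed for 99% of people; it only shows how images with no consensus could be thrown out.

```python
import math


def orientation_consensus(angles_degrees, min_agreement=0.9):
    """Return (mean_angle, agreement), or None when users do not agree.

    agreement is the mean resultant length: 1.0 when every answer matches,
    near 0.0 when answers are spread evenly around the circle.
    """
    if not angles_degrees:
        return None
    sin_sum = sum(math.sin(math.radians(a)) for a in angles_degrees)
    cos_sum = sum(math.cos(math.radians(a)) for a in angles_degrees)
    agreement = math.hypot(sin_sum, cos_sum) / len(angles_degrees)
    if agreement < min_agreement:
        return None  # no clear "up": discard the image
    mean_angle = math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0
    return mean_angle, agreement


if __name__ == "__main__":
    # Answers clustered around upright, wrapping across the 0/360 boundary.
    print(orientation_consensus([358, 2, 0, 1, 359]))
    # Answers scattered around the circle, e.g. a photo of stars.
    print(orientation_consensus([0, 90, 180, 270]))
```

An ordinary average would put 358 and 2 degrees at 180, which is why the sketch averages the angles as points on a circle instead.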

  24. RSnake Says:

    @Damien - that could be true if the images had a clear direction. If you’re looking at a picture of the stars, which way is up? So maybe the algo has a way to throw out images that never had a decisive answer? Either way that seems problematic.

  25. AviD Says:

    I think you’re skipping over one important point here: everyone who builds a “better” CRAPTCHA (Completely Ridiculous Attempt at Pathetically Trying to Curtail Heavy Abuse) is trying to solve *the wrong problem*.

    In truth, you don’t really CARE whether it is an unattended computer at the other end, or a young Indian being paid 2.50$ for every 1000 CAPTCHA images he “solves”, or some middle-aged American who needs to answer the CAPTCHA to get his daily serving of pr0n.
    What you DO care about, is that your site/application/service is not *abused*, specifically by mass overuse. Obviously, CAPTCHA *in concept* does not help with that - instead of asking, “are you a computer or a human?” the proper question should be: “are you a legitimate user of this site, or are you spamming/flooding/automating/attacking my system?”

    On the other hand, as @Michael said, “It’s completely trivial to tell machines and humans apart”… if you’re a human: http://xkcd.com/632/
    hmm… Maybe the only real way to solve this problem, is with human interaction? i.e. if you can solve it, you’re human ;-)

  26. RSnake Says:

    @AviD - I’m not skipping over that - in fact it wasn’t my point at all. My point isn’t whether it can slow down some attackers, my point is that it’s not a valid Turing test. That’s all. I’m taking exception to the use of the term as being overly optimistic and not based in reality. It’s not a CAPTCHA, it is a test, but not a CAPTCHA - but I’ve actually never seen a CAPTCHA before, to be honest.

  27. AviD Says:

    @RSnake - so what you’re saying is, it’s an issue of semantics? :)
    While I do agree with your point - if we’re not clear on the concepts, we’ve already lost - my point was even more so: Even if we were to get the terms right, or find an actual CAPTCHA, it would still be the WRONG test.
    The test should be one of legitimate use as opposed to flooding.

  28. RSnake Says:

    @AviD - I could not have said that better myself.

  29. Carter Cole Says:

    i bet it would be easy to write something to find visual cues that the image is rotated or not