I was thinking about the KittenAuth CAPTCHA, since I messed with it a little over the weekend. As I said, the number one issue with that particular system is the small number of possible solutions. It’s not just about finding the right kittens; it’s also about the probability of stumbling onto the right answer. If you just guess, the probability is 3/9 × 2/8 × 1/7 = 6/504, or 1 in 84 (given 3 correct images in a set of 9), whereas a normal CAPTCHA of, say, 6 digits gives a guesser a 1 in 1,000,000 chance (just a tad worse odds there). The other problem is that it has such a small set of photos.
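To sanity-check that arithmetic, here’s a quick Python sketch. The function name `guess_probability` is just mine for illustration. It assumes the guesser picks 3 distinct images out of 9 uniformly at random, and that every 6-digit string from 000000 to 999999 is equally likely in the numeric CAPTCHA.

```python
from fractions import Fraction
from math import comb


def guess_probability(total_images: int, correct_images: int) -> Fraction:
    """Chance of blindly picking exactly the correct images (order doesn't matter)."""
    p = Fraction(1)
    for i in range(correct_images):
        # First pick: 3 of 9 are right; second: 2 of 8; third: 1 of 7.
        p *= Fraction(correct_images - i, total_images - i)
    return p


kitten = guess_probability(9, 3)
six_digits = Fraction(1, 10 ** 6)  # assumed: 10 choices for each of 6 positions

print(kitten)                             # 1/84
print(kitten == Fraction(1, comb(9, 3)))  # True: same as 1 over "9 choose 3"
print(float(kitten / six_digits))         # how many times easier the kitten guess is
```

So a blind guess at the kittens comes out roughly four orders of magnitude easier than a blind guess at a 6-digit code, under those assumptions.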
I did some cursory research on Yahoo images and Google images, and I found that Yahoo had a far superior data set of actual kitten images. Google not only reported half as many images, it was also less accurate in what it found in the first 20-30 pages. If you were to use Yahoo’s data set there would be very little pruning needed, whereas if you used Google’s image search you’d be removing things like a band named “Atomic Kitten” and some Melbourne-based transvestite named “Kitten”. The point being, even if you could gather such a set and prune it, you’d still be at the mercy of a robot that could gather every image on the internet named “Kitten” and build a data set large enough to compare against, and the system would be broken again. And that’s leaving out Bayesian heuristics.
There’s a company called MessageLabs that uses something beyond pixel-by-pixel comparison and even beyond pixel color densities to determine if something is porn (those are the most common methods of content filtering, and also very flawed). MessageLabs also verifies what is in the photo. For instance, it can tell what a hand is, or what a car is, or what a sky is, so it is less likely to flag something like a flesh-colored door, a baby picture, or something more gray-area like a swimsuit photo at the beach as porn. Using something like this against KittenAuth could completely break the system, as if it weren’t already broken enough the way it is today.