web application security lab

Archive for the 'CAPTCHA' Category

Spammers Hurt The Blind

Sunday, May 4th, 2008

There’s an interesting link talking about the lawsuit that Rite Aid just settled regarding their accessibility issues. In part it was in regards to their in-store issues, but it was also about their online accessibility, specifically around CAPTCHAs. So I spent a little time doing some more research into other issues around CAPTCHAs and the blind and in fact there are even concerns around the audio CAPTCHAs for the deaf-blind users.

One thing that was interesting is that many of the sites that have been targeted for lawsuits and angst have been either online retailers or heavily text-based websites (Typepad, Livejournal, etc…). I guess that makes perfect sense, I just hadn’t thought about it before. I would expect there to be a lot more of this in the future, so if you use CAPTCHAs I’d consider at least getting an audio version, as I’ve discussed countless times. An interesting thought though: spammers have made it harder on the blind. Yet another reason to hate spammers, I guess.

Malware Solving CAPTCHAs

Thursday, November 1st, 2007

There’s an interesting link on MSNBC about malware that’s trying to solve CAPTCHAs. Basically it’s using the ruse of a sexy girl who tempts you with nudity if you type in some letters/numbers. The letters/numbers are, of course, CAPTCHAs from social networking sites, webmail or whatever. Very clever, but also very stupid at the same time.

One thing we’ve seen actually is pretty clever. Malware has the ability to do a lot, including re-writing webpages on the fly. However, the goal isn’t just to re-write some banners (yes, sometimes that is the goal) but sometimes it’s to steal information. And sometimes it makes sense from an attacker’s perspective to ask for an additional piece of information (like a social security number) on a form. What I haven’t seen is adding an additional CAPTCHA to a page, which would be totally invisible to the average user (unlike a stripper on your desktop, which is sort of the opposite of subtle).

Good Articles on CAPTCHAs

Wednesday, August 22nd, 2007

Mark Burnett has a few good articles on my single favorite love-to-hate security measure, the CAPTCHA. Check the articles out here and here. They do a good job at explaining some of the high level problems with CAPTCHAs but don’t be fooled, this is only the tip of the iceberg as I’m sure Matt would agree. If you look on sla.ckers there is post after agonizing post where people are building and then breaking CAPTCHAs.

Jeremiah had a good post on this a year ago describing what makes an effective CAPTCHA. I’d like to go one further. I have actually never seen said mythical beast. I’m not even sure it can be done with the technology we have at our disposal. What I’m getting at is this. People have deficiencies and those deficiencies must be dealt with for them to be able to solve a puzzle. Some deficiencies are pretty debilitating and include blindness. Okay, so we have audio CAPTCHAs to address that issue. Then we have colorblind people. They too can use the audio CAPTCHAs.

Then we have things like pwntcha, pron proxies and a whole host of other ways to “break” CAPTCHAs in a way that they were not intended. Bummer. It’s getting to the point where I cannot even fathom what a good CAPTCHA would look like. Everything is either far too hard for people to solve, or far too easy for computers to solve. The stuff that’s in the middle is usually bad for both. I’m up for an experiment. Can anyone point to a good example of a CAPTCHA anywhere on the Internet - one that meets all the rules outlined by Jeremiah’s post?

CAPTCHA Breaking Game

Wednesday, June 13th, 2007

As mentioned on Ronald’s blog and a rather suspicious digg entry linking to a referral code (indicating that the person who dugg this is somehow related to the site) there is a CAPTCHA breaking service located at decodetowin. The site claims to be running a sweepstakes and the only way to win is to “decode” the CAPTCHAs. Here is text from the site:

What is Decode to Win? Decode to Win is a contest website in which you decode graphical messages to increase your chance at winning a prize. You get one point for every message you decode. At the end of each week, we pick a random user from the top 15 point holders and send him/her a prize offering. In some cases, we will send prizes to more than one user.

No doubt, signing up adds your name to validated spam lists - they get you coming and they get you going. Interesting premise though. It appears that they are breaking Google CAPTCHAs by the looks of it, but it’s difficult to know for sure unless you are Google. One interesting thing I noticed as I was testing it is that the first one succeeds while the following tries always fail until you reload the flash file. It’s unclear why they do this, but my guess is that it is likely that people will try more than once, and it is unlikely that they will sign up. So it’s worth getting them to try three or more times to see if they simply typoed the second try. It’s out there folks, no one should doubt that CAPTCHAs definitely are being broken. Thanks to Ronald for pointing this one out.

reCAPTCHA Image Processing To Stop Bots

Friday, June 8th, 2007

A few weeks ago Ben Maurer posted a link to a service called reCAPTCHA that attempts to solve the spam problem in the typical CAPTCHA way while solving another hard problem at the same time. reCAPTCHA is a part of a project to scan old books. But part of the problem with scanning using OCR is that you get crap results sometimes. Therein lies the reCAPTCHA idea - replaying that odd looking text to users and getting them to type the answers in, next to a real CAPTCHA. If the user gets the real CAPTCHA right, the system assumes that their answer for the OCR image is valid too.

So the next question is what happens if someone doesn’t answer the second question at all or puts in something erroneous. That’s okay - it uses a voting system to make sure more than one person agrees (I’m not sure on the specifics of the voting system). That makes for a pretty interesting system in a lot of ways. However, one comment made by “Anonymous” on Ben’s site caught my eye.

“Chinese radio scare alert: these people want to exploit your brainpower with their captcha tricks! It’s like enslaving humanity, one word at a time!”

You certainly get extra points for originality of your idea.

I’m sure nobody will get my Chinese radio reference though…

What Anonymous is referring to is the Chinese Lottery. It’s a theory in cryptography where you can force many people to do very small tasks to get the answer to a bigger problem (in the lottery example force the government supplied radios to perform small parts of a very large crypto problem). For instance, if they can somehow ask users to perform a math function that is somehow more efficient for a user to do than a computer, then it makes sense. There is another similar theory using biochemical reactions in a DESosaur, where each cell of an organism combines to perform a computationally complex task, but given the volume of cells in any sizable creature, it would have enormous computing power.
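To make the "many small tasks" idea concrete, here is a toy sketch. Everything in it is an assumption for illustration: the 16-bit key size, the 64 devices, the secret value and the function names are all made up, and a real attack would run the slices in parallel rather than in a loop.

```python
KEY_BITS = 16          # toy key size so the search finishes instantly (assumed)
RADIOS = 64            # hypothetical number of participating devices
SECRET_KEY = 0xBEEF    # hypothetical key the "lottery" is looking for


def split_keyspace(key_bits, workers):
    """Divide [0, 2**key_bits) into contiguous slices, one per worker."""
    total = 1 << key_bits
    chunk = -(-total // workers)  # ceiling division
    return [(start, min(start + chunk, total)) for start in range(0, total, chunk)]


def search_slice(start, stop, is_correct):
    """The small task a single device performs: try every key in its slice."""
    for key in range(start, stop):
        if is_correct(key):
            return key
    return None


def run_lottery(key_bits, workers, is_correct):
    """Simulate the devices one after another; return (worker_id, key) or None."""
    for worker_id, (start, stop) in enumerate(split_keyspace(key_bits, workers)):
        found = search_slice(start, stop, is_correct)
        if found is not None:
            return worker_id, found
    return None


winner = run_lottery(KEY_BITS, RADIOS, lambda key: key == SECRET_KEY)
print(winner)
```

Each device only ever checks 1,024 keys here, which is the whole point: no single participant does much work, but together they cover the entire keyspace.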

Granted, reCAPTCHA is terrible at this - it is far more efficient to perform any mathematical task with a computer than anything a human could do. The only way I could see this being used in a nefarious way, other than the CAPTCHA proxy idea, is if part of what a government needed to do was OCR classified documents (this could be even more effective in other languages where translation services are at a premium). While possible, it sounds like quite a conspiracy theory to me. But Anonymous can rest assured that someone out there understood his reference! ;)
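For the curious, the control-word-plus-voting mechanism described at the top of this post could look something like the sketch below. I don't know how the real voting system works, so the thresholds (MIN_VOTES, MIN_SHARE), the function names and the sample words are all assumptions, not reCAPTCHA's actual rules.

```python
from collections import Counter

MIN_VOTES = 3     # assumed: matching answers needed before a word is accepted
MIN_SHARE = 0.6   # assumed: fraction of trusted answers that must agree


def normalize(word):
    """Ignore case and surrounding whitespace when comparing answers."""
    return word.strip().lower()


def record_answer(votes, control_word, control_answer, unknown_answer):
    """Count a vote for the unknown word only if the known word was solved.

    Returns True if the user passed the CAPTCHA (control word correct).
    """
    if normalize(control_answer) != normalize(control_word):
        return False
    guess = normalize(unknown_answer)
    if guess:  # a blank second answer passes the test but casts no vote
        votes[guess] += 1
    return True


def consensus(votes):
    """Return the agreed transcription, or None if there is no agreement yet."""
    total = sum(votes.values())
    if total == 0:
        return None
    word, count = votes.most_common(1)[0]
    if count >= MIN_VOTES and count / total >= MIN_SHARE:
        return word
    return None


votes = Counter()
answers = [
    ("morning", "upon"),
    ("morning", "upon"),
    ("mourning", "junk"),   # failed the control word, so "junk" is ignored
    ("morning", ""),        # skipped the second word
    ("morning", "upon"),
    ("morning", "up0n"),    # honest typo, outvoted
]
for control_answer, unknown_answer in answers:
    record_answer(votes, "morning", control_answer, unknown_answer)

result = consensus(votes)
print(result)
```

The design choice worth noticing is that a wrong or missing second answer never hurts the user; it just fails to move the vote.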

CAPTCHA Proxy Service

Sunday, May 6th, 2007

One concept I have been playing with a lot lately is interesting ways to take the robot out of CAPTCHA solving, but still solving it subversively. Sure, we came up with the mechanical turk methods, the porn proxy, using kids’ games, and a variety of other low tech solutions. However, the other day, I came up with a concept for an actual service that does this. Let me explain:

CAPTCHAs or any automated Turing tests in general attempt to see if the consumer is a robot or not by throwing up an image to test if the human can read it. The reason why webmasters use them is so they can detect if the user is real or not. So webmasters have a need, and spammers also have a need. Webmasters want to detect if a user is really a person or not, and a spammer wants to solve those CAPTCHAs in whatever way is effective. So here’s the concept.

By setting up a central proxy with APIs for webmasters you can solve both problems at once. The webmaster gets to have unique CAPTCHAs by using the API to query the proxy. The proxy pulls a CAPTCHA from somewhere on the Internet that a spammer wants to break. The spammer uses their own API to decide if the consumer typed in the correct answer or not and sends a decision back to the webmaster through the proxy. The webmaster then can allow the user to succeed or fail as they choose. The only motivation for the black-hat webmaster to do this is if they represent a lower value target than the websites that the spammer tends to attack and/or if they don’t care about other websites’ problems with security.

Of course this is entirely black-hat, and provides no good service whatsoever, but it does solve two different people’s problems at the same time. Of course this symbiosis does introduce latency by slowing the consumer down while they wait for the proxy and the spammer to validate the entry. Maybe a credit system would need to be put in place based on the latency time to ensure quality. This service exploits one of the two fatal flaws in CAPTCHAs - even if it works perfectly and can detect whether it is a person or not, it cannot detect their intentions (the second being that if it is created by a computer it can be read by a computer). Yah, evil, I know.

Solving CAPTCHAs for Cash

Friday, April 27th, 2007

I had a really interesting conversation with a guy out of Romania this morning regarding a team of CAPTCHA solvers that he has set up. The basic premise is that he has 5 guys set up to solve CAPTCHAs like Yahoo, MySpace, and Hotmail. He does this for clients all over the world. The economics are probably the most interesting part, since his team is non-technical and only types in what they are given by their clients.

The economics are as follows: 300-500 CAPTCHAs per person per hour. The clients pay between $9 and $15 per 1000 CAPTCHAs solved. The team works around 12 hours a day per person. That means they can solve somewhere around 4800 CAPTCHAs per day per person, and depending on how hard the CAPTCHAs are that can run you around $50 per day per person (his estimate). The reason it’s not higher is because they take breaks, and fail sometimes.
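If you want to check the math yourself, here is the arithmetic as a small sketch. The figures are the ones quoted above; the variable and function names are just mine, and the 400 per hour midpoint is my assumption about where the 4800 figure comes from.

```python
RATE_PER_HOUR = (300, 500)   # CAPTCHAs solved per person per hour (quoted range)
HOURS_PER_DAY = 12           # hours worked per person per day (quoted)
PRICE_PER_1000 = (9, 15)     # dollars paid per 1000 CAPTCHAs (quoted range)


def daily_volume(rate_per_hour):
    """CAPTCHAs one person solves in a working day at a given hourly rate."""
    return rate_per_hour * HOURS_PER_DAY


def daily_revenue(volume, price_per_1000):
    """Dollars billed for a day's volume at a given price per thousand."""
    return volume * price_per_1000 / 1000


low_volume = daily_volume(RATE_PER_HOUR[0])    # slowest solver
high_volume = daily_volume(RATE_PER_HOUR[1])   # fastest solver
mid_volume = daily_volume(sum(RATE_PER_HOUR) // 2)  # assumed midpoint, 400/hour

print("volume range per day:", low_volume, "to", high_volume)
print("midpoint volume:", mid_volume)
print("revenue at midpoint: $%.2f to $%.2f" % (
    daily_revenue(mid_volume, PRICE_PER_1000[0]),
    daily_revenue(mid_volume, PRICE_PER_1000[1]),
))
```

At the 4800 midpoint that works out to $43.20 to $72.00 per person per day before breaks and failures, so his $50 estimate sits toward the low end of that band.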

He also admits it takes some time to ramp them up on new CAPTCHAs. Eventually they get faster at solving them. So for $50 a day, you can get your own human CAPTCHA solver. The ages of the solvers range from 18-23 years old. Pretty interesting stuff - what a crappy job!

Vidoop

Wednesday, April 18th, 2007

In an interesting email that was sent to me I was asked to take a peek at a new software tool, not yet released to the public, called Vidoop (there is an interesting article on it here). While I was unable to actually take a look at the software, I’ve got a pretty good idea of how it works from the Wired article. After downloading a software certificate that allows you to use their software basically you say, “I like animals” and it shows you pictures of horses and cats and dogs all mixed in with a bunch of non-animal photos. You choose the correct photos (a la kittenauth CAPTCHA) and you are granted access.

So here are the major problems with this that I see. Firstly, it’s probably not accessible (meaning there aren’t alt tags on the images) because if there were it would take only a few guesses to get in since the computer could build databases of “like” things. So basically, like in kittenauth, the blind are screwed (which we have talked about a dozen times and I really don’t want to start another conversation on it, I’m just sayin’). Secondly, it’s non-portable because you have to have the software installed on the computer you want to use. That means you can only use it from one computer (forget going over to a friend’s house and logging in) and if that one computer gets hosed you need to find an alternate path for getting the software installed (which is often the least secure part of these systems). This type of design is a lot less portable than tokens and for a consumer tokens are nearly unusable too.

Also something that makes me uncomfortable from a security perspective is the concept of single sign-on. I’ve always thought single sign-on was a great usability improvement but often terrible from a security perspective. Like the old motivational adage - you’re only as strong as your weakest link - the same is often true with single sign-on. You are often at the mercy of the weakest security model. If any one site is insecure you can (in many of the cases of single sign-on that I have seen) end up compromising all the other trusted sites. Perhaps Vidoop has a great way to solve that issue that revolutionizes the way authentication works and never opens itself up for attack under any scenario. Without looking at it, there’s no way for me to know.

Lastly, because Vidoop uses a relatively small set of photos to choose from, there are only a few general choices from which to brute force (otherwise you’d run into overlap and false positives). If I know the target is a male, chances are they aren’t going to pick the fuzzy animals. If I know the target is a 13 year old girl, chances are they aren’t going to pick photos of computers or sports cars and so on. Anyway, you see the problems with this. Unlike passwords, which are user specific (and still guessable), this is highly un-arbitrary. Does it stop phishing, keystroke logging, cure cancer or any other magical things? I can’t say without looking at it. Will I be using it for large scale mission critical secure production installs? Doubtful.

Internet Explorer Accepts Style Attributes on Closing HTML Tags

Monday, March 19th, 2007

There’s a really interesting thread on sla.ckers.org talking about bypassing some fairly rigid anti-XSS filters that allow nothing that looks like HTML. Specifically it doesn’t allow <[A-Za-z] which does limit the vectors pretty substantially. In the process of working through the attack vector Hong mentioned that an attack could surface inside of an end HTML tag. Here’s the example code:

</a style="xx:expression(alert('xss'))">

It gets around the filter because there is no letter immediately following the open angle bracket, it is a forward slash. I’m not exactly sure why any end tag should be allowed to have style information associated with it, since that doesn’t really make sense contextually. This doesn’t appear to work in Firefox or Opera, but it does work in Internet Explorer, which is used by a vast majority of the browsing community. I wanted to wait until the exploit actually worked before posting it, as it was a very interesting way to bypass filters that probably wouldn’t have worked in any other way (with the possible exception of injecting nulls). Nice find, Hong!
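Here is a quick way to see the filter miss for yourself. This is only a sketch: I’m assuming the filter is a plain search for the <[A-Za-z] pattern mentioned above, and the stricter pattern is just one possible tightening that I’m not claiming is complete. The function names are mine.

```python
import re

# The filter as described: block anything where "<" is followed by a letter.
TAG_FILTER = re.compile(r"<[A-Za-z]")

# One possible tightening (an assumption, not a complete fix): also treat
# "</" followed by a letter as the start of a tag.
STRICTER_FILTER = re.compile(r"</?[A-Za-z]")


def is_blocked(payload, pattern=TAG_FILTER):
    """Return True if the filter pattern matches anywhere in the payload."""
    return pattern.search(payload) is not None


opening_tag = '<a href="#" style="xx:expression(alert(\'xss\'))">'
closing_tag = '</a style="xx:expression(alert(\'xss\'))">'

print("opening tag blocked:", is_blocked(opening_tag))
print("closing tag blocked:", is_blocked(closing_tag))
print("closing tag blocked (stricter):", is_blocked(closing_tag, STRICTER_FILTER))
```

The closing tag slips through the original pattern because the character after the angle bracket is a slash, exactly as described above.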

Target Sued By The Blind

Thursday, October 26th, 2006

Once again, the blind are at it - wanting equality and accessibility. Those pesky blind people! No but seriously, this is really pretty important and although I am pretty anti-litigious I think the National Federation of the Blind is making a statement by suing Target. Yes I know I’ve mentioned this before, but I started thinking about this some more in the wake of this recent MSNBC article. Blind people cannot use the Internet in the same way people with vision can. They cannot “see” the page layout. One thing I haven’t talked much about is semantic relationships in HTML. It’s a very simple concept that eludes most people who claim to know HTML (at least they put it on their resume).

One of the major problems I see with the way HTML is constructed is tables. Tables are one of the most useful constructs in HTML. You put things in columns and rows, and it makes sense. The problem is that it’s not accessible. The way tables are constructed you read down the column instead of across the row. It’s easier to dump the contents of a select statement in SQL than put it into a multi dimensional array and output one row element at a time in order. Thus it is no longer semantically correct.

Let’s say I have a simple table that has this sort of data in it:

Name   Age  Sex
Alice  32   Female
Bob    53   Male
Cathy  38   Female

A person who is blind hears that as follows: “Name Alice Bob Cathy Age 32 53 38 Sex Female Male Female.” That’s not terrible with such a small list but when the table grows to many columns with many rows in it, it’s nearly impossible for the person to understand which person you are now talking about. If the table were re-constructed to be in semantic order it would make more sense, “Name Age Sex Alice 32 Female Bob 53 Male Cathy 38 Female.” I understand CSS has come to the rescue but with completely different looks and feels and bugs depending on what browser you are using. My question is, why haven’t we invented a new table structure in HTML that is semantically correct? It’s not radical thinking, it’s a simple solution to giving accessibility and still allowing an easy standard way to display data in HTML.
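The two reading orders quoted above are easy to reproduce. This is just a sketch that assumes the table is held as a list of rows; it says nothing about what any particular screen reader actually does, and the function names are mine.

```python
rows = [
    ["Name", "Age", "Sex"],
    ["Alice", "32", "Female"],
    ["Bob", "53", "Male"],
    ["Cathy", "38", "Female"],
]


def read_by_column(table):
    """Linearize the table column by column (down each column in turn)."""
    return " ".join(cell for column in zip(*table) for cell in column)


def read_by_row(table):
    """Linearize the table row by row (across each row in turn)."""
    return " ".join(cell for row in table for cell in row)


print(read_by_column(rows))
print(read_by_row(rows))
```

With three columns the column-first version is still followable; add a dozen more and the name ends up dozens of words away from the values that belong to it.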

Anyway, sorry, that was probably a tangent. The real reason I’m writing this post is to drive home the fact that the CAPTCHAs people have been using on their enterprise websites are going to get them sued unless they have an alternative. We’ve talked about this before, and I’ve been given the impression that people just aren’t sensitive to this issue by the very same people who built those CAPTCHAs. I wonder what it will take for people to realize it’s just not a good idea from a security perspective (porn proxies completely circumvent the value since you can trick people in any context to type in those CAPTCHAs for you) and from a legal perspective. Hell, it doesn’t even have to be a porn site that relays the CAPTCHAs to unsuspecting users, it could be a blog… a web application security blog. Hmmm…