Billy Hoffman and I had a really interesting conversation at the WASC meetup yesterday and I thought I’d share the gist with you, because I think it deserves more thought from a wider forum - we’re smart guys, but I think you all could help out here by giving your thoughts. The concept is stopping CSRF using a modification to modern browsers: an X-Header that tells the website the context in which the user is being sent to the function in question.
The Theory: The concept is that you have an X-Header that tells the website how the user got there. It may look something like this (as an example):
X-Request-Context: a_href
Okay, so the browser tells the machine that the user clicked on a link to end up there. Seems pretty benign. Now let’s look at an attack:
X-Request-Context: img_src
Looks pretty bad to me. Somehow the user ended up visiting the function through an image tag. Looks pretty suspicious, doesn’t it? How would that ever happen in nature (except in the case where you have dynamically created images, or someone typos a URL)? In the latter case, it’s still bad, so don’t allow the function to fire. Gracefully die.
My initial retort: The next thing I thought of is how is this any different from fixing the Referer field? Why do we need a new technology to tell us that a user is coming from somewhere they shouldn’t? Granted, the Referer field doesn’t work. It’s a) not there all the time and b) can be spoofed. But what if we could fix those problems? Billy was hard pressed to come up with an answer, as was I. So I let the conversation be, and slept on it. Then I came up with a counterpoint.
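To make "gracefully die" a bit more concrete, here is a minimal sketch of what the server side might look like. Everything in it is an assumption for illustration: X-Request-Context is only the proposed header (no browser sends it), the allowed values are the ones used as examples in this post, and the function names are made up. Refusing requests with no header at all is also my assumption, not part of the proposal.

```python
# Hypothetical header from the proposal above; no browser actually sends it.
CONTEXT_HEADER = "X-Request-Context"

# Assumption: only deliberate user actions may trigger a sensitive function.
ALLOWED_CONTEXTS = {"a_href", "form_submit"}


def get_header(headers, name):
    """Look up a header by name, ignoring case (HTTP header names are case-insensitive)."""
    for key, value in headers.items():
        if key.lower() == name.lower():
            return value.strip()
    return None


def context_allowed(headers):
    """Return True only if the request declares an allowed context.

    A missing header is refused here. That is an assumption: the proposal
    does not say how to treat browsers that do not send the header.
    """
    return get_header(headers, CONTEXT_HEADER) in ALLOWED_CONTEXTS


def handle_sensitive_request(headers):
    """Refuse with a 403 instead of letting the function fire."""
    if not context_allowed(headers):
        return 403, "Forbidden"
    return 200, "OK"
```

With this sketch, a request carrying `X-Request-Context: img_src` is refused, while one carrying `a_href` goes through.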
My objection to my objection: There are situations where a function can be called from the correct page without the user meaning to. If I send the user to an otherwise benign page that has both a) a link to the function in question and b) an XSS hole, I can make the user have the correct Referer but the context would still look wrong:
X-Request-Context: iframe_src
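A rough sketch of why the two signals differ, assuming a Referer you could actually trust (which today you can't) plus the hypothetical X-Request-Context header. The host name, URLs and function names below are placeholders I made up for illustration.

```python
from urllib.parse import urlsplit

# Assumption: contexts that represent a deliberate user action.
USER_ACTION_CONTEXTS = {"a_href", "form_submit"}


def referer_matches(referer, trusted_host):
    """True if the Referer URL points at the site's own host."""
    if not referer:
        return False
    return urlsplit(referer).hostname == trusted_host


def evaluate(referer, context, trusted_host):
    """Report the Referer check and the context check separately."""
    return {
        "referer_ok": referer_matches(referer, trusted_host),
        "context_ok": context in USER_ACTION_CONTEXTS,
    }


# The XSS-on-a-benign-page case: right Referer, wrong context.
result = evaluate("https://example.com/benign-page", "iframe_src", "example.com")
print(result)
```

Here the Referer check alone would wave the request through, and only the context value flags it.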
Hmm… okay, so maybe request context is interesting after all?
My counter-counter objection: So now all we need to know is that the context is valid. But wait, in the case of XSS it’s still possible for me to click on buttons with JavaScript. So even though the JavaScript on the page is not legitimate, both the Referer and the context would be correct:
X-Request-Context: form_submit
So it doesn’t look like that’s very bulletproof. Not to mention that a developer would still need to know to use this and code for it. That requires them to know what XSS and CSRF are, and if they already know that, there are probably better ways to mitigate this. I feel like my brain is spinning out of control here - I’m arguing with myself.
Some hope: Now let’s look at what I think MAY actually have some real value. Two technologies are giving us a lot of problems right now: XMLHttpRequest and Flash. Both allow us to rewrite or add headers (depending on browsers and versions, yadda yadda). Both of those could provide their own context to allow us to know when they are being used. For example:
X-Request-Context: xmlhttprequest
Or…
X-Request-Context: flash
It’s very uncommon for users to want to allow XMLHttpRequest unless they know that they are outputting some XML data. Likewise, when do you want a Flash page to link to you? Not too often, I’d say. So maybe there is some small incremental value in this technology concept, but I would rather debate it in an open forum.
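As a starting point for that debate, here is one way a site might express "XMLHttpRequest only where I expect it, Flash nowhere" if the hypothetical X-Request-Context header existed. The paths, the policy table and the function name are all invented for illustration.

```python
# Assumption: by default only deliberate user actions are accepted.
DEFAULT_CONTEXTS = frozenset({"a_href", "form_submit"})

# Hypothetical per-path exceptions: only the XML feed expects XMLHttpRequest.
# Nothing lists "flash", so Flash-originated requests are refused everywhere.
ENDPOINT_POLICY = {
    "/feed.xml": frozenset({"a_href", "xmlhttprequest"}),
}


def allowed_for(path, context):
    """True if the declared context is acceptable for this path."""
    allowed = ENDPOINT_POLICY.get(path, DEFAULT_CONTEXTS)
    return context in allowed


print(allowed_for("/feed.xml", "xmlhttprequest"))          # feed expects XHR
print(allowed_for("/account/transfer", "xmlhttprequest"))  # sensitive function does not
print(allowed_for("/account/transfer", "flash"))           # Flash is never listed
```

The point of the table is that the developer opts in to XMLHttpRequest per function, rather than having to anticipate every bad context.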