About a year back I actually broke down and read the Firefox source code (brutal, I tell you) and came across an interesting cross site scripting vector based on a function called Non-alpha-non-digit. The function assumes that anything that is not alphanumeric can be ignored in certain contexts, because it’s probably a mistake. Of course that has interesting web application security implications, because those characters can obfuscate the actual string that web application firewalls are looking for.
Deconstructing an HTML string can be super complicated, and I’ve said it before but it’s worth saying again: the differences in the way the rendering engines work make it murder for anyone attempting to look for all the vectors. Not many people have the patience to download all the browsers, test each one against each vector and fuzz them individually. Despite what you may think, that is a very painstaking and convoluted process that needs to be repeated 5 times for each test for each browser variant, and it is still prone to missing things.
I had a string of emails with yawnmouth who brought my attention to the non-alpha-non-digit attack vector again. His are slightly different than the one I had posted before. Here are both of his:
<a href="#"www/onclick=alert(/xss/)>click me! (works in ie)</a>
<a href="#"onclick/=alert(/xss/)>click me! (works in ff)</a>
The first one is really very similar to the vector I already have listed on the page, so I’m not going to go into that one (and it doesn’t apply to non-alpha-non-digit anyway). But the second one is a great example of one 65,000th of the problem (I’m not exaggerating). Almost any char that isn’t a letter or a number can be put between the event handler name and the rest of the cross site scripting string in question.
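To make the problem concrete, here is a minimal Python sketch of the kind of filter this trips up. The regex and the function name `naive_filter_flags` are my own hypothetical stand-ins, not taken from any real web application firewall; the assumption is simply that the filter expects the handler name to be followed directly by an equals sign.

```python
import re

# Hypothetical naive rule: an "on..." handler name followed
# (after optional whitespace) by "=". Not from any real product.
NAIVE_HANDLER = re.compile(r"\bon[a-z]+\s*=", re.IGNORECASE)


def naive_filter_flags(markup: str) -> bool:
    """Return True if the naive rule thinks markup contains an event handler."""
    return NAIVE_HANDLER.search(markup) is not None


# An ordinary handler, which the naive rule catches.
plain = '<a href="#" onclick=alert(/xss/)>click me!</a>'

# yawnmouth's Firefox vector: a single "/" sits between the name and "=".
vector = '<a href="#"onclick/=alert(/xss/)>click me! (works in ff)</a>'

if __name__ == "__main__":
    print("plain flagged: ", naive_filter_flags(plain))
    print("vector flagged:", naive_filter_flags(vector))
```

The only difference between the two strings that matters here is one slash, and that is enough for the rule to miss the handler even though Gecko still treats it as `onclick`.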
Even worse, you can string them together and make it even harder to detect. Of course just looking for the event handler string “onload” would find this, but if you were attempting to parameterize the string into tokens, it is doubtful you would find what you are looking for with a non-alpha-non-digit string. Here’s an example that does not require user interaction to fire:
I updated the XSS Cheat Sheet with the vector. But as I was saying, this problem is far bigger than just the chars that I’ve got listed. ANY character can work as long as it isn’t a letter [a-zA-Z], a number [0-9] or a parameterization token, like angle brackets and single and double quotes. The interesting thing is that because Internet Explorer does use grave accents as a parameterization token, filters should take that into account, but the Gecko rendering engine ignores the grave accent as a non-alpha-non-digit, and therefore your filter must handle that case too. See? Browser differences make coding against cross site scripting extremely difficult.
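The rule in the paragraph above can be sketched as a slightly more tolerant pattern. This is only an illustration under my own assumptions: `handler_pattern` and its `treat_grave_as_junk` switch are hypothetical names, the "junk" class is built straight from the description here (anything that is not a letter, a number, an angle bracket, a quote or the equals sign itself), and the test strings are made up for the example.

```python
import re


def handler_pattern(treat_grave_as_junk: bool = True) -> re.Pattern:
    """Build a regex for 'on<name><junk>=' (hypothetical helper).

    treat_grave_as_junk=True models the Gecko behavior described above,
    where a grave accent is ignored as a non-alpha-non-digit.
    treat_grave_as_junk=False models a parser that treats the grave
    accent as a parameterization token, so it cannot be skipped.
    """
    # Characters that may NOT appear between the handler name and "=".
    excluded = "a-zA-Z0-9<>'\"="
    if not treat_grave_as_junk:
        excluded += "`"
    return re.compile(r"\bon[a-z]+[^" + excluded + r"]*=", re.IGNORECASE)


gecko_like = handler_pattern(treat_grave_as_junk=True)
token_like = handler_pattern(treat_grave_as_junk=False)

# Made-up test inputs, not taken from the cheat sheet.
slash_vector = '<a href="#"onclick/=alert(/xss/)>click me!</a>'
junk_vector = '<body onload!#$%&=alert(1)>'
grave_vector = '<body onload`=alert(1)>'

if __name__ == "__main__":
    for name, text in [("slash", slash_vector),
                       ("junk", junk_vector),
                       ("grave", grave_vector)]:
        print(name,
              "gecko_like:", gecko_like.search(text) is not None,
              "token_like:", token_like.search(text) is not None)
```

A filter that wants to be safe across browsers would have to use the more permissive of the two interpretations. Note that a pattern this loose will also flag harmless text such as `one = two`, so treat it as a way to see the shape of the problem, not as a finished filter.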
Personally, unlike a lot of people I am completely browser agnostic (they all have their pluses and minuses), but things like this make my head hurt. Browsers need to come to a consensus about which tokens are valid and which aren’t. Anyway, this is yet another browser trick to test which browser you are using (grave accents work in IE, and not in Firefox). To be completely honest, I use a very modified version of Firefox for my daily surfing, and Internet Explorer for Intranet surfing or when I know a site is safe and I want the enhanced features offered by Internet Explorer - or when something just doesn’t render correctly in Firefox. Let the browser wars continue.
Thanks to yawnmouth for making me take another look at this!