web application security lab

Non-alpha-non-digit XSS Part 2

About a year back I actually broke down and read the Firefox source code (brutal, I tell you) and came across an interesting cross site scripting vector based on a function called Non-alpha-non-digit. The function assumes that anything that is not alphanumeric can be ignored in certain contexts, because it’s probably a mistake. Of course that has interesting web application security implications, because those characters can obfuscate the actual string that web application firewalls are looking for.

Deconstructing an HTML string can be super complicated, and I’ve said it before but it’s worth saying again: the differences in the way the rendering engines work make it murder for anyone attempting to look for all the vectors. Not many people have the patience to download all the browsers, test each one against each vector and fuzz them individually. Despite what you may think, that is a very painstaking and convoluted process that needs to be repeated 5 times for each test for each browser variant, and it is still prone to missing things.

I had a string of emails with yawnmouth who brought my attention to the non-alpha-non-digit attack vector again. His are slightly different than the one I had posted before. Here are both of his:

<a href="#"www/onclick=alert(/xss/)>click me! (works in ie)</a>
<a href="#"onclick/=alert(/xss/)>click me! (works in ff)</a>

The first one is really very similar to the vector I already have listed on the page, so I’m not going to go into that one (and it doesn’t apply to non-alpha-non-digit anyway). But the second one is a great example of one 65,000th of the problem (I’m not exaggerating). Literally any non-alphanumeric char can be put between the event handler and the cross site scripting string in question.
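A minimal Python sketch of how test cases for this could be generated, assuming one filler character is placed between the handler name and the equals sign. The names `variants`, `is_candidate` and `EXCLUDED` are made up for illustration, and the default range is limited to ASCII; whether a given browser actually fires each variant still has to be checked by hand.

```python
import string

# Characters the post says do NOT qualify: letters, digits, and
# "parameterization tokens" (angle brackets, single and double quotes).
# "=" is also skipped here (an assumption) because it would just double
# the assignment sign.
ALNUM = set(string.ascii_letters + string.digits)
EXCLUDED = set("<>\"'=")


def is_candidate(ch):
    """True if ch could be tried as a non-alpha-non-digit filler."""
    return ch not in ALNUM and ch not in EXCLUDED


def variants(handler="onclick", payload="alert(/xss/)", limit=0x80):
    """Yield one test string per candidate code point below `limit`."""
    for cp in range(1, limit):
        ch = chr(cp)
        if is_candidate(ch):
            yield f'<a href="#"{handler}{ch}={payload}>click me!</a>'


if __name__ == "__main__":
    for line in variants():
        print(repr(line))
```

Raising `limit` to `0x10000` covers the whole 16-bit range, which is where the "one 65,000th" figure comes from.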

Even worse, you can string them together and make it even harder to detect. Of course just looking for the event handler string “onload” would find this, but if you were attempting to parameterize the string into tokens, it is doubtful you would find what you are looking for with a non-alpha-non-digit string. Here’s an example that does not require user interaction to fire:

<BODY onload!#$%&()*~+-_.,:;?@[/|\]^`=alert("XSS")>

I updated the XSS Cheat Sheet with the vector. But as I was saying, this problem is far bigger than just the chars that I’ve got listed. ANY character can work as long as it isn’t a letter [a-zA-Z], a number [0-9] or a parameterization token, like angle brackets and single and double quotes. The interesting thing is that because Internet Explorer does use grave accents as a parameterization token, filters should take that into account, but the Gecko rendering engine ignores the grave accent as a non-alpha-non-digit, and therefore your filter must do the same. See? Browser differences make coding against cross site scripting extremely difficult.
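To make the detection problem concrete, here is a hedged Python sketch comparing two regular expressions. Both patterns (`NAIVE` and `TOLERANT`) are assumptions for illustration, not rules taken from any real web application firewall. The tolerant one treats the grave accent as ignorable filler, which follows the Gecko behavior described above and would need separate handling for Internet Explorer.

```python
import re

# Naive check: handler name followed (after optional whitespace) by "=".
NAIVE = re.compile(r"\bon\w+\s*=", re.IGNORECASE)

# Tolerant check: between the handler name and "=", allow any run of
# characters that are not letters, digits, angle brackets or quotes.
TOLERANT = re.compile(r"\bon[a-z]+[^a-z0-9<>\"'=]*=", re.IGNORECASE)

# The two Gecko-oriented vectors quoted in the post.
VECTORS = [
    '<a href="#"onclick/=alert(/xss/)>click me!</a>',
    r'<BODY onload!#$%&()*~+-_.,:;?@[/|\]^`=alert("XSS")>',
]


def flagged(pattern, html):
    """Return True if the pattern finds an event-handler assignment."""
    return pattern.search(html) is not None


if __name__ == "__main__":
    for v in VECTORS:
        print(flagged(NAIVE, v), flagged(TOLERANT, v), v)
```

A pattern like this is only a partial check: it does nothing about other encodings or contexts, so it should not be read as a complete filter.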

Personally, unlike a lot of people I am completely browser agnostic (they all have their plusses and minuses), but things like this make my head hurt. Browsers need to come to a consensus about which tokens are valid and which aren’t. Anyway, this is yet another browser trick to test which browser you are using (grave accents work in IE, and not in Firefox). To be completely honest, I use a very modified version of Firefox for my daily surfing, and Internet Explorer for Intranet surfing or when I know a site is safe and I want the enhanced features offered by Internet Explorer - or when something just doesn’t render correctly in Firefox. Let the browser wars continue.

Thanks to yawnmouth for making me take another look at this!

14 Responses to “Non-alpha-non-digit XSS Part 2”

  1. David Ward Says:

    Thanks for linking to my blog, man. That’s pretty cool of you. Web sites like those hurt my head as well, since I am also browser agnostic. I use both Internet Explorer and Firefox, though I favor Firefox between the two. I posted those links for those who come to the blog and didn’t like Internet Explorer.

  2. phaithful Says:

    OMG! You freak’n read the Firefox source code?!

  3. DanielG Says:

    I hope phaithful is being sarcastic because:

  4. RSnake Says:

    Thanks, David, NP about the link… I just try to post to a few relevant blogs out there - cuz believe it or not, mine isn’t the only site on the web. ;)

    phaithful, yah, I read the source for the rendering engine. It’s really complicated, as you probably would have guessed, but it’s also pretty rock solid. When I read the Gecko source I was really on a kick to find lots of unknown vectors, and I did end up finding a few, but I didn’t find nearly as many as I did reading the HTTP spec and all the associated RFCs. I wouldn’t recommend reading the RFCs first, because it turns out none of the browsers are 100% compliant anyway. It’s better to take a guess, go with it and then validate it against the spec in question once you already have the browser behavior figured out. Ugly!

    That’s part of why fuzzing is such a slow process, despite what most people think about it. I wouldn’t ever release my fuzzer to the world because it is only about 20 lines of Perl. It’s such a manual process. I wish there was a better way, but unfortunately almost all the vectors I find break at least one basic rule, which would make finding them programmatically extraordinarily difficult.

  5. Kyle Eslick Says:

    Thanks for the link!

    Great analysis. I subscribed to your feed and look forward to reading your content.

    I’m a Firefox junkie (which should be obvious!), but I have been playing around on IE7 and can see where it has its uses for certain sites, etc.

    Keep up the good work!

  6. RSnake Says:

    Thanks, Kyle! Yah, I could tell you prefer Firefox. I definitely use it. But if you are just interested in browser compatibility with websites, Netscape 8.0 is an interesting alternative since it can switch between IE and Gecko with the press of a button. Of course IE Tab in Firefox does the same thing. As I said, all the browsers have their own pros and cons in usability and security.

  7. Dave Says:

    I’m a firefox user, too. Since IE Tab exists, I deleted all Shortcuts for the IE.
    Another nice thing is the eval() command in CSS. Did you know that almost no email platform checks for this in HTML mails?
    …now, you do ;)

    Today, I made a small interactive XSS Workshop… more a test. Nothing for you RSnake, but pretty good for beginners, I think.
    Can get it at ( is my german-speaking blog).

  8. RSnake Says:

    That was actually kinda fun, Dave, thanks… although I think the 5th test is broken or has the incorrect password or something (same as #4) and it won’t accept it. Fun though! You might want to give some more hints (or spoilers or something) for the users who are new to saving files and editing them, etc…

    Personally, once I knew the JS had the password in it, enciphered or not, it was pretty trivial to use Burp Proxy to inject the JS decipher function string into the page upon return (kinda cheating, I’m sure, but hey, there weren’t any rules posted!). ;) You could actually limit that kind of cheating by not even emitting the correct password unless they use the correct injection, but I’m not sure whether keeping them from using the tools available to them is going to make them better at XSS or worse.

  9. Dave Says:

    RSnake, that’s a problem in the firefox. A reset() on a form doesn’t reset the hidden input fields, and when tried to decipher the password, the value changes (cipher.js wasn’t written by myself).

    The problem with the password thing is, that there are always more ways to inject the script… I cannot check them all.
    But you’re right, I should write some rules ;)

  10. Albert Says:

    hey i was looking at dave’s challenge, can anyone offer any hints on how to inject an event into the code with strip_tags() enabled.

  11. RSnake Says:

    Albert, I think I’d need more detail to help you. It was several weeks ago I looked at it, and I don’t think that ever became a problem for me but then again, I was cheating (submitting through a proxy that I had control over) so…

  12. Albert Says:

    Well when you input tags it strips them automatically. Entering anything with a ‘

  13. RSnake Says:

    In this case, I don’t think that ever became a problem for me, like I said. I used Burp Proxy to watch the data being returned to the browser. Before I let it render, I injected the JavaScript that it required me to to get the password. Simple enough. But there are a number of exploits that don’t use any quotes on the cross site scripting cheatsheet.

  14. Albert Says:

    Oh, the tag doesn’t render properly here. I meant to say < or >