web application security lab

XSS Scanning With Browser Emulation

This is an idea I’ve kicked around many times - the concept of using an emulated browser to detect cross site scripting. Well, it looks like I wasn’t alone in my thinking. Rapid 7 has built a new JavaScript/DOM/Browser emulator that can help sniff out those pesky XSS exploits. I’ve actually had passing conversations with the SPI guys and I know they were at least thinking about something similar, but I’m not sure if it’s been deployed yet or not.

There are some obvious pitfalls in this. It’s only a snapshot, so you must run it over and over again. It can’t emulate all browsers, because it is probably written with something like SpiderMonkey, which only matches Firefox’s behavior. That leaves a pretty gaping hole in the things it can detect accurately. And probably the most important issue is that to detect things accurately it really should download every object on the page, which makes something like this deathly slow and bandwidth intensive.

But overall I really support this type of scanning. It’s at least solving some of the major issues that most scanners don’t find, with things like obfuscated, dynamically generated JavaScript and very odd CSS obfuscation issues. That said, I’m still not sure this is the right answer, given the issues I mentioned above. But I’m glad smart people are thinking about it. Just a warning to all scanning companies: I’m sure I have prior art somewhere from almost two years ago, so no one had better patent this. I’d rather it be out there so everyone can use it and come up with better ideas on their own.
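
To make the "did it land somewhere that would execute?" question concrete, here is a minimal sketch in Python. It is not how Rapid 7 or any other vendor does it, and it is not a browser emulator: it swaps the JavaScript engine for a plain static HTML parse, so it will miss anything that is generated dynamically. The marker string and the function names are made up for illustration.

```python
from html.parser import HTMLParser

MARKER = "xssprobe1337"  # hypothetical unique token injected by the scanner


class ExecContextFinder(HTMLParser):
    """Record places where the marker sits in a script-executing context."""

    def __init__(self, marker):
        super().__init__(convert_charrefs=True)
        self.marker = marker
        self.in_script = False
        self.hits = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
        for name, value in attrs:
            if value is None or self.marker not in value:
                continue
            if name.startswith("on"):
                # onerror, onclick, onmouseover, ...
                self.hits.append(f"event handler {name} on <{tag}>")
            elif value.strip().lower().startswith("javascript:"):
                self.hits.append(f"javascript: URL in {name} on <{tag}>")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script and self.marker in data:
            self.hits.append("inline <script> body")


def executable_reflections(html_text, marker=MARKER):
    """Return a list of executable contexts the marker was reflected into."""
    finder = ExecContextFinder(marker)
    finder.feed(html_text)
    finder.close()
    return finder.hits
```

A response that merely echoes the marker as encoded text produces no hits, while one that reflects it inside a script block or an event handler does. A real emulator goes further and actually runs the script, which is where the browser-specific quirks come in.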

10 Responses to “XSS Scanning With Browser Emulation”

  1. nEUrOO Says:

    Okay… too late for the idea!
    I’m currently making a tool (a minimalist web app scanner) and tried to find a Python binding for Gecko, for JavaScript execution…

    Actually, I really think this is a good solution for decreasing the number of false positives that web app scanners report for XSS.

  2. RSnake Says:

    Yah, I agree.. I definitely think it’s an idea worth thinking about for a number of applications. I don’t think you should stop looking into it, because I doubt the scanners are going to open source theirs. The more scanning technology that’s out there the better. Where would we all be without nmap?

  3. MightySeek Says:

    We’ve been doing this for about a year with NTOSpider, and it can be done without the negative network performance impact you mentioned, by caching a lot of the re-used JS files. But it does have the limitation you mention about being browser specific, which is why we have versions for both IE and Moz which we check each page through. I think it gives us pretty good coverage.

  4. RSnake Says:

    I think so too… those are the big ones. You won’t miss much (some but not much) by not covering Netscape, Opera etc… Do you upgrade as newer versions of the OS come up? For instance the difference between IE 6.0 and 7.0 is fairly dramatic.

  5. LabsMan Says:

    You’re right, RSnake. We shipped this feature with our XSS Intelligent Engine in WebInspect. Before we flag cross site scripting as a vulnerability, we execute the JavaScript and confirm that it actually runs. This has drastically reduced the false positives in scans. This is definitely the way to go; it saves the pen tester time when verifying results from the scanner.

  6. RSnake Says:

    Interesting. I’ve heard mixed results from other people’s tests: some of the tests on the XSS cheat sheet, for instance, can inadvertently lead to other types of problems, like DoS, broken SQL and other issues. I know at least one scanner out there is using very short and totally benign strings to see if they are echoed back at all, and then using humans to review the detail. As a performance optimization it makes sense to do it the way you’re talking about. I’m only speaking as a third party hearing some details in passing. I’m sure you wouldn’t do it if it were costing you more time.

  7. Trevor Jim Says:

    You might be interested in our project, BEEP:

    http://www.research.att.com/~trevor/beep.html

    It uses the browser to detect scripts and relies on a policy set by the web site to know which scripts are OK, and which ones are not.

  8. Acidus Says:

    Using a JavaScript parser to test for XSS execution drastically reduces false positives, and by using IE’s JS DLL or SpiderMonkey you can detect all the quirks too. Reducing false positives helps everyone in the industry, so I’m glad multiple vendors have seen the light.

    Now that everyone has solved detecting whether an XSS executes, the problem comes down to making sure you are sending the right attacks. Sure, you could send a few dozen (potentially dangerous) test probes from the XSS cheat sheet. One solution that appeals to some of our mutual friends is human intervention, but that is better suited to professional services than to automated scanners. Another solution is a brute-force, pre-compiled list of various attack strings with prefixes to escape HTML comments, attributes, etc., all wrapped inside a bunch of encodings. If only attack strings could be dynamically generated based on application feedback… :-)

  9. nEUrOO Says:

    I know that nowadays we do not use many Java applets, but they still exist…
    And by the way, a Java applet can interact with JavaScript objects.

    Have you guys done anything on this point? Do you load a JVM to check the applet and see its possible interactions with JavaScript (and so possible XSS, SQL injection, etc.)?

  10. RSnake Says:

    Trevor, this is pretty interesting and very similar to some of the Content-Restrictions stuff I was talking about with the Mozilla guys a few years back: http://ha.ckers.org/blog/20060601/content-restrictions-and-xss/

    Acidus, perhaps there is a waterfall methodology we could employ to make your detection more automated. Taking human intelligence and applying it to a waterfall of output as the page changes. It would be terribly complex but I think it’s doable. I programmatically enter some string of characters, I expect some output, I get something different. Based on that difference I try something new, expecting some output, I get something else, and so on. Until you have exhausted probable exploits or you have a human intervene.
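
The waterfall described in the last comment (send something, compare the output with what was expected, pick the next step from the difference) could be sketched roughly as follows. This is an illustration under assumptions, not any scanner's actual logic: `send` is a hypothetical callable that submits a string to the application and returns the response body, the marker is arbitrary, and the decision table is deliberately tiny.

```python
from typing import Callable, Optional, Set

MARKER = "zqx7"  # hypothetical benign token, unlikely to occur naturally


def surviving_chars(send: Callable[[str], str], chars: str = "<>\"'") -> Set[str]:
    """Step 1: send one benign probe per character and note which come back raw."""
    survived = set()
    for ch in chars:
        probe = f"{MARKER}{ch}{MARKER}"
        if probe in send(probe):
            survived.add(ch)
    return survived


def next_payload(survived: Set[str]) -> Optional[str]:
    """Step 2: choose the next attempt from what the application let through."""
    if {"<", ">"} <= survived:
        return f"<script>{MARKER}()</script>"
    if '"' in survived:
        return f'" onmouseover="{MARKER}()'  # only useful inside a quoted attribute
    if "'" in survived:
        return f"' onmouseover='{MARKER}()"
    return None


def waterfall(send: Callable[[str], str]) -> str:
    """Step 3: try the chosen payload; hand off to a human when it is altered."""
    payload = next_payload(surviving_chars(send))
    if payload is None:
        return "no candidate: all probed characters were encoded or stripped"
    if payload in send(payload):
        return f"candidate for confirmation: {payload}"
    return "needs human review: payload was altered in transit"
```

For example, `waterfall(lambda s: "<p>" + s + "</p>")` ends at the script-tag candidate, while an application that HTML-encodes everything ends at "no candidate". A fuller version would loop through more rounds and feed each candidate to the execution check before flagging anything.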