Accuracy and Time Costs of Web Application Security Scanner Report
Larry Suto is back with another report outlining the differences between some of the top web application scanners on the market. Before you get all uptight and start flaming me, I in NO WAY sponsored, encouraged or had anything to do with this test in any way. In fact, I only found out about it a few days ago. Not that I think that’ll stop the flame wars, but just direct your ire appropriately, please! Anyway, he took a different approach this time, and instead of running the scanners against something he had devised to be used only in his own lab, he turned all the scanners on each other’s public test sites. You might think they should all fare fairly well since they’re all public and there’s nothing stopping them from testing to their heart’s content. But that’s anything but what he found. You can download Larry’s report here.
Some of the more interesting findings were that Burp Suite Pro (an extremely cheap product built by Portswigger - and a damned fine manual testing tool, I might add) fared better than Qualys or WebInspect when trained. I always loved Burp Suite! Larry’s commentary is particularly amusing as you go through it, with choice quotes, like:
Accuntix missed 31% of the vulnerabilities after training and 37% without training. This is a significant cause for concern as they should be aware of the links vulnerabilities on their own site and be able to crawl and attack them.
And…
WebInspect missed 66% of the vulnerabilities, even after being trained to know all of the pages. They missed 42% of the vulnerabilities on their own test site after being trained and 55% before training.
NTOSpider by NT Objectives came out in the lead with the best overall score of the application scanners tested (which included Acunetix, AppScan, Burp Suite Pro, Hailstorm, WebInspect, and NTOSpider). He also measured things like how long the various scanners take to configure, support and so on - all important things for companies about to make the big investment. This isn’t all scanners everywhere (notably WhiteHat is missing, as is the newest player to the field, NetSparker, who incidentally took it upon themselves to add themselves into the report after the fact, and other free web assessment tools, like Nikto etc.), but it’s a great start to a long future of heavily debated research, I’m sure. Love him, or hate him, Larry’s always got interesting research to share!



February 3rd, 2010 at 2:35 pm
Shame NetSparker wasn’t included. Although it’s a very new tool, I find it very powerful; not sure how it would compare to others. The report was definitely an eye opener.
February 3rd, 2010 at 2:48 pm
I did a search in this paper for “Link Extraction” and nothing came up. I’d like to see wivet.googlecode.com data, please, not mere speculation as to how the JavaScript crawlers work! Wivet will tell you exactly how they work…
The `ultimate’ point-and-shoot is a home-grown self-service portal that runs w3af through optional passive analysis tools such as BurpSuite, Casaba Watcher, and ratproxy. Give that to your developers and run at system test time as part of builds (whether in a build/CI server or not).
I found the part about “Group 1 vs. Group 2” at the end of the report to be the most interesting. I would think that Group 1 mostly consists of penetration-testers. Fortunately, Netsparker was released to be a webapp scanner specifically for pen-testing. Unfortunately, it wasn’t covered.
Group 2 usually consists of application security consulting companies that provide full app assessments. A lot of times a full app assessment means doing runtime, static analysis, and manual code review. Webapp scanners to these people provide some (but not all) automation for the runtime portion of their full vulnerability assessment.
In the case of Group 1 for developers/managers/netsecguys/etc, most of these people want point-and-shoot so that they can go back-and-forth with protections (e.g. fixing the code for developers, placing/testing WAF rules for IT/Ops guys, managers looking at risk over time on a graph, etc) without the need for an expert.
It would be interesting to hear what group you’re in. I’m in Group 2 “most of the time”. If there is a pen-test with specific goals and a stopwatch — you bet I’m in Group 1.
APT is probably in Group 2. What Group are you in?
February 3rd, 2010 at 3:21 pm
I just started reading the report and noticed the numbers for Burp in the first diagram are off. Based on the table data later on, the number of false negatives for Burp in the “Falsely Reported and Missed Vulnerabilities” chart should be 98, not 122.
I would pass this to Larry directly but I couldn’t identify any easy method to contact him. Anyhow, great analysis and methodology. This must have been a lot of work to put together.
February 3rd, 2010 at 3:56 pm
Ok, finished paper. Saw Larry’s email at the end. I already contacted him.
February 4th, 2010 at 3:21 am
I’m Bogdan Calin, working for Acunetix.
I’ve analyzed the report and some of these vulnerabilities were not found because Larry didn’t record or properly use a Login Sequence (so the scanner can authenticate on the website).
For example, I’m talking about our website: testphp.acunetix.com
The files userinfo.php, cart.php are only available when the user is logged in. I’ve just created a Login Sequence and scanned the website and the vulnerabilities were found.
However, this report helped us identify some bugs in our application and I’m working on fixing them as we speak. I have to thank Larry for his effort.
February 4th, 2010 at 4:16 am
I’m curious - did he deliberately misspell “Accunetix” as “Accuntix”?
February 4th, 2010 at 5:04 am
I’m reading this about BurpSuitePro: “It also lacks any automated form population solution, and simply prompts the user to fill out any form it comes up to, which on many sites would be quite a significant effort”
That is not true. Check out Spider->Options
There you have the ability to either “don’t submit forms”, “prompt for guidance” or “automatically submit using the following rules to assign parameter values”. And there you can tweak whatever you want by using regexp. You can also specify that unmatched fields will be set to “whateveryouwant”.
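To make that last option concrete, here is a minimal sketch of the general idea of rule-based form filling: match each field name against a list of regexes and fall back to a default for anything unmatched. This is not Burp's actual implementation; the rules, sample values and the `fill_form` helper are all hypothetical and only illustrate the technique.

```python
import re

# Hypothetical rules: (regex matched against the field name, value to submit).
# The first matching rule wins, so order the more specific patterns first.
FIELD_RULES = [
    (re.compile(r"mail", re.IGNORECASE), "tester@example.com"),
    (re.compile(r"(phone|tel)", re.IGNORECASE), "5555550100"),
    (re.compile(r"(user|login|name)", re.IGNORECASE), "tester"),
]

# Value used for any field that no rule matches.
DEFAULT_VALUE = "whateveryouwant"


def fill_form(field_names, rules=FIELD_RULES, default=DEFAULT_VALUE):
    """Map each form field name to the value of the first rule that matches it."""
    values = {}
    for name in field_names:
        for pattern, value in rules:
            if pattern.search(name):
                values[name] = value
                break
        else:
            # No rule matched this field name, so use the default.
            values[name] = default
    return values


if __name__ == "__main__":
    print(fill_form(["email", "username", "zip"]))
```

With rules like these a spider can keep crawling past forms without stopping to ask, at the cost of sometimes submitting values the application rejects.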
February 4th, 2010 at 7:56 am
Hey guys,
We have conducted the test and added Netsparker to the report.
I’ve tried to stick to Larry’s format as much as possible. There were also some unexpected outcomes: all 7 scanners missed some issues, Acunetix hard-coded some vulnerabilities in their websites, etc.
Here it is:
http://www.mavitunasecurity.com/blog/netsparker-accuracy-and-time-costs-of-web-application-security-scanner-report/
February 4th, 2010 at 10:15 am
Curious if anyone tested Rapid7’s product.
February 4th, 2010 at 10:31 am
Hi Ferruh,
I work for Acunetix as well.
I can assure you that we did not hard-code any vulnerabilities in our test websites; we don’t like to fool our clients. I do not think it is the right approach to blame someone without having any facts, either.
February 4th, 2010 at 12:30 pm
Ferruh is right about the phpaction parameter vulnerability. Initially, it was eval($_POST["phpaction"]); but we had a lot of problems with people thinking that it is funny to hack test websites (that were supposed to be vulnerable). I got tired of these people and hard-coded that vulnerability, thinking that it would still show up in our scanner without providing direct PHP code execution to any script kiddie. However, it was not my intention to give an unfair advantage to our scanner. I wasn’t expecting that our test website would be used in a comparison against other scanners.
February 4th, 2010 at 12:44 pm
Bogdan, Robert,
I wasn’t trying to blame anyone, I was just trying to point out that that test shouldn’t be in the report. I hope it didn’t sound otherwise.
It’s obvious that it wasn’t planted there to block other scanners. Just to make it clear, I’ve updated the blog post.
February 4th, 2010 at 2:05 pm
Thank you very much Ferruh
February 4th, 2010 at 5:58 pm
We used Rapid7. It’s great for all-purpose network and host vulnerability scans and reporting, including asset reporting. It has baseline capabilities as a db or web app scanner. Its web scanning capabilities are further hurt by the difficulty in setting up authenticated scans. You cannot script the login.
February 4th, 2010 at 8:38 pm
What training was done for this report? Could this be published as well?
February 7th, 2010 at 3:14 pm
I use NTOSpider most days and it has nothing on AppScan, followed by Acunetix.
February 8th, 2010 at 3:06 pm
This sounds like a perfect project for an Accredited University such as those listed at:
http://www.owasp.org/index.php/Membership#Current_OWASP_Organization_Supporters_.26_Individual_Members
to provide a research report around, with controls and test subjects... just sayin’
February 9th, 2010 at 6:02 pm
I see that no OS vulnerabilities were included in these tests. I ran our scanner (Web Site Security Audit) on the 6 test sites and found server problems that are as severe as any web app issues. That was in addition to the web app vulnerabilities we found.
More than a third of all sites we are asked to scan have server weaknesses that are problematic. Shouldn’t any web scan test include the kinds of weaknesses that brought us blaster, code red or slammer?
February 10th, 2010 at 8:21 am
@Brian - I would agree with you to some extent. If there’s a way to break into the site, even if it’s not using the site, it should be in scope. But that doesn’t necessarily fit the requirements for a successful web application scanner either. I’d consider it more of a nice to have than a requirement. But that said, not having a network/host security portion to your web application scanner means that you need to supplement it with something that can provide those services. I see many companies using multiple tools for different tasks, so that’s not at all uncommon. In fact, lots of times in assessment work customers will specifically ask us to avoid testing the host/network and just focus on the web application. I personally think that’s short-sighted and I do my best to talk them out of it. But to answer your question, I don’t necessarily think a web application scanner must check for non-web-application vulnerabilities to be a valuable tool.
In the same vein, lots of network testing tools don’t even try to do simple DNS enumeration (a la Fierce). Should they? I’d argue that it’s just as critical as any other part of an assessment to know what the surface area of the target is, but it doesn’t necessarily have to be part of the same tool. Again, it’s a nice to have to combine them rather than having them be standalone tools.
February 10th, 2010 at 9:05 am
More comments from the vendors (if anyone knows any more please send them to me):
HP (WebInspect)’s statement: http://www.communities.hp.com/securitysoftware/blogs/spilabs/archive/2010/02/08/on-web-application-scanner-comparisons.aspx
Acunetix’s statement: http://www.acunetix.com/blog/news/latest-comparison-report-from-larry-suto/
WhiteHat’s statement: http://jeremiahgrossman.blogspot.com/2010/02/wheres-whitehat-re-scanner-comparisons.html
IBM (Watchfire)’s statement: https://www.ibm.com/developerworks/mydeveloperworks/blogs/paulionescu/entry/benchmarking_web_application_security_scanners10?lang=en_us
Cenzic (Hailstorm)’s statement: http://blog.cenzic.com/public/item/250026
February 17th, 2010 at 2:15 pm
I’m growing increasingly concerned with the continued attempts at commoditization in this space. While commoditization is generally a good thing in terms of driving prices down and a sign of a maturing market, I believe there is a serious disconnect between customer expectations and what these products actually do. I hope this report falls into the hands of as many people who are in a buying position as possible. It sheds much needed light on what automation really gets you. WhiteHat’s decision to opt out of the analysis and their reasons for doing so are somewhat suspect as well. I would be interested in seeing a report that details what SaaS offerings at basement prices really get you. As they say, a pig wearing lipstick is still a pig and scaling crap testing to increase coverage is still crap testing.
February 24th, 2010 at 4:41 pm
NT OBJECTives’s statement: http://www.ntobjectives.com/blog/response-to-2010-suto-report