Invariable Parameters
Last night I visited Professor Giovanni Vigna and his post-grad computer security research assistants at UCSB. Giovanni invited me to come and participate in a small Q&A session with his students. They are a group of very smart individuals and it was fun talking with them. We talked about quite a few things, some more interesting than others, but there was one thing I wanted to share because I thought it was very clever.
Giovanni and a few of his students started building a new WAF. Despite my back and forth love/hate relationship with WAFs, I thought there was one thing pretty innovative about their tool (granted I haven’t played with it personally yet). It was the concept of invariable parameters. In its simplest form, think about this example from a URL:
list-price=500&quantity=10&total=5000
Using machine learning, the tool begins to understand how certain parameters relate to one another. In the above example it is trivial to tamper with a parameter and change the total price to something else:
list-price=500&quantity=10&total=1
Using anomaly detection it can see that the total has been tampered with because the product of 500 and 10 is not 1. Of course that’s an overly simple example, but you can at least see what they are talking about. Pretty interesting stuff. Although I still have my reservations I’m glad smart people are thinking about the problem. I had a great time with Giovanni and his class, and I hope to meet up with them again sometime!
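To make the arithmetic concrete, here is a minimal sketch of that one check in Python. It hard-codes the relationship total = list-price × quantity, which is an assumption for illustration only: as I understand it, their tool learns such relationships from traffic rather than having them written by hand, and the function name `total_is_consistent` is my own invention.

```python
from urllib.parse import parse_qs


def total_is_consistent(query: str) -> bool:
    """Return True when total equals list-price * quantity.

    Hypothetical hand-written check; a learning WAF would infer
    this relationship from observed requests instead.
    """
    params = parse_qs(query)
    try:
        price = int(params["list-price"][0])
        quantity = int(params["quantity"][0])
        total = int(params["total"][0])
    except (KeyError, ValueError):
        # A missing or non-numeric parameter is treated as inconsistent.
        return False
    return price * quantity == total
```

The untampered query string from the example passes this check and the one with total=1 fails it.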



April 20th, 2007 at 2:06 pm
While this is clever, at a 30 second glance I couldn’t come up with an example where relational fields couldn’t be programmed into the page, eliminating these extra variables.
Does anyone have a better example off the top of their head?
There is definitely a lot of unlocked power in anomaly detection.
April 20th, 2007 at 2:47 pm
I hope their invariants aren’t limited to the request, because the very existence of invariants there is an indication that the web application is flawed - any properly designed system should be free of redundancies. On the other hand, anomaly detection can easily mistake something for an invariant. For example, if a certain article in a computer shop is a computer that costs $5000, anomaly detection could easily associate quantity 1 with this article because nobody buys more. Then, one day some company decides to buy a stock of these computers and - bang! So something that is rare isn’t necessarily worth an error message.
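The false-positive risk described above can be sketched with a deliberately naive model that treats any quantity it has never seen for an item as anomalous. The class name `SeenValueModel` and the item name are hypothetical, and this is not a claim about how the UCSB tool works, only an illustration of how rare-but-legitimate input trips a detector trained on typical traffic.

```python
from collections import defaultdict


class SeenValueModel:
    """Naive detector: a quantity is 'normal' only if seen in training."""

    def __init__(self):
        # Maps an item id to the set of quantities observed for it.
        self.seen = defaultdict(set)

    def train(self, item_id, quantity):
        self.seen[item_id].add(quantity)

    def is_anomalous(self, item_id, quantity):
        # .get avoids creating empty entries for unknown items.
        return quantity not in self.seen.get(item_id, set())


model = SeenValueModel()
for _ in range(1000):
    # Everyone so far has bought exactly one $5000 computer.
    model.train("computer", 1)

# A legitimate bulk order is flagged because it is rare, not malicious.
bulk_order_flagged = model.is_anomalous("computer", 50)
```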
April 20th, 2007 at 7:39 pm
Here’s an example of what they could catch: ‘ OR 1=1 , but also all possible variations such as ‘ OR 5+2=7. Gotta thank Bayesian statistics. Given a sufficient set of normal input (it’s safe to assume the majority of traffic to a web site is “safe”), the probability of catching something that deviates from the norm is very high.
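As a rough illustration of learning from "normal" input, the sketch below is much simpler than a Bayesian model: it just records which characters appear in known-good values for a parameter and scores new values by the fraction of characters never seen before. The function names and sample values are made up, and a real detector would need far more than this.

```python
def learn_alphabet(samples):
    """Collect every character that appears in the known-good samples."""
    alphabet = set()
    for sample in samples:
        alphabet.update(sample)
    return alphabet


def unseen_fraction(value, alphabet):
    """Fraction of characters in value that never appeared in training."""
    if not value:
        return 0.0
    unseen = sum(1 for ch in value if ch not in alphabet)
    return unseen / len(value)


# Hypothetical training data for a username-like parameter.
normal_alphabet = learn_alphabet(["alice", "bob", "carol42"])
```

With this training set, a value such as bob scores zero, while both ' OR 1=1 and ' OR 5+2=7 score high because quotes, spaces, and operators never showed up in normal input.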
April 22nd, 2007 at 9:19 am
It’s a nice idea but if I get the point, this won’t help that much:
The example is “list-price=500&quantity=10&total=5000” but now if we want to change the values, we can do that, for example, like this: “list-price=1&quantity=1&total=1”. Now the price is $1 and the calculation is still correct.
I also had some thoughts on the problem of “Invariable Parameters”. My approach is only useful if you know in advance what the values for a request could be, but that is always the case when you send the client some possible values that it should use (for example in a navigation). I wrote a blog post about it, called “Protect your Web Applications through Encryption”, which you can find here:
http://www.disenchant.ch/blog/protect-your-web-applications-through-encryption/57
PS: The algorithm I’ve used is absolutely crap but I’m working on that and it’s just a POC
Regards,
Sven
April 22nd, 2007 at 9:51 am
@Sven - as I said, this is an overly simple example, only to demonstrate a point. If you know itemid=12345 always corresponds to $500 you can get around that issue. Don’t take this example literally, it was only a way for you to get what I was talking about.
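A quick sketch of that idea, assuming the request also carries an item id and the server knows the price for it. The `PRICES` table and the function name are hypothetical stand-ins for whatever the application or the WAF has learned about itemid=12345.

```python
# Hypothetical server-side knowledge: item id -> known list price.
PRICES = {"12345": 500}


def request_is_valid(item_id, quantity, list_price, total):
    """Check the submitted price and total against the known price."""
    known_price = PRICES.get(item_id)
    if known_price is None:
        # Unknown items cannot be validated, so reject them.
        return False
    return list_price == known_price and total == known_price * quantity
```

With this check, list-price=1&quantity=1&total=1 for item 12345 is rejected even though its arithmetic is internally consistent.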
April 28th, 2007 at 8:57 am
This idea could be useful, but I don’t believe it can totally replace hand-made WAF rules. I am currently analyzing XSS/CSRF attacks on a web application: 2-3k are detected daily, of which about 5% are human-generated.
Such a self-learning WAF would not detect scanner/bot activity (as it is constantly present, and not a rare condition). At the same time, it would leave me with hundreds of alerts every day (most of them probably false alarms). I would know which ones were the rarest, which in many cases means the most dangerous, but not with XSS attacks (where the same pattern from many IP addresses means real trouble).
It would be great as an additional tool, but I don’t think it could replace application-specific hand-made rules, no matter how intelligent that self-learning WAF is.