There is a concept in web application security that I’ve been toying with for a few years, based on a common consumer feedback mechanism called waterfall analysis. Typical waterfall analysis measures the rate of drop-off from page to page. The easiest way to think about it is a flow made up of multiple pages, each of which represents some required user interaction (a registration flow is a perfect example). On each page, the ratio of users who arrive is calculated to help diagnose drop-off. If you start with 100% of your userbase on the first page, some amount less than that will go on to the second page, some amount less than that will go on to the third page, and so on.
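To make the arithmetic concrete, here is a minimal sketch of a conventional waterfall calculation. The page paths and visitor counts are made up for illustration, and `waterfall` is a hypothetical helper name, not part of any real analytics product.

```python
def waterfall(page_counts):
    """Return each page's share of the users who entered the flow.

    page_counts is a list of (page, unique_users) pairs in flow order.
    """
    entered = page_counts[0][1]
    return [(page, users / entered) for page, users in page_counts]


# Hypothetical registration flow with invented numbers.
counts = [
    ("/register/step1", 1000),
    ("/register/step2", 600),
    ("/register/submit", 450),
]

for page, share in waterfall(counts):
    print(f"{page}: {share:.0%}")
```

With these invented counts, the shares fall from 100% to 60% to 45% as users move through the flow.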
Now think about it from an attacker’s point of view. Let’s say you were able to separate your good traffic and bad traffic into two distinct buckets perfectly. Let’s also assume we only care about users who completed the flow (causing the function to fire) and so we are only looking at those users. 100% of both good and bad traffic make it to the function, each in their respective buckets. Here’s what it might look like:
When you put it like this it looks pretty obvious (this doesn’t account for users who hit errors and have to re-submit, or multiple path flows, but you get the idea). Using an inverse waterfall you can see that attackers tend not to enter your page through a normal page-flow. They don’t act like normal users because they don’t have to. They simply attack the critical infrastructure directly, performing the malicious function.
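One way to compute an inverse waterfall for a single bucket of traffic might look like the sketch below. It assumes you already have (session id, path) pairs pulled from your logs and that the flow is a fixed list of pages. The `FLOW` paths and the `inverse_waterfall` name are hypothetical, and, as noted above, it ignores re-submits and multiple-path flows.

```python
from collections import defaultdict

# Hypothetical flow; the last entry is the page that fires the function.
FLOW = ["/register/step1", "/register/step2", "/register/submit"]


def inverse_waterfall(events, flow=FLOW):
    """For sessions that reached the final page, return the share of them
    that also visited each page of the flow.

    events is an iterable of (session_id, path) pairs.
    """
    pages_by_session = defaultdict(set)
    for session_id, path in events:
        pages_by_session[session_id].add(path)

    final_page = flow[-1]
    completers = [p for p in pages_by_session.values() if final_page in p]
    if not completers:
        return {}

    return {
        page: sum(page in pages for pages in completers) / len(completers)
        for page in flow
    }
```

Run once over the good bucket and once over the bad bucket, the final page is 100% in both by construction. The difference shows up in the earlier pages, where the share for traffic that went straight to the function drops off sharply.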
This might sound familiar because flow analysis is an old trick used by products like AppShield and in lots of anti-CSRF functions. But none of them look at it from quite this angle. They look at it from the opposite side: you must have a credential to move on. That is good, because the inverse waterfall can be broken by an attacker who simply follows all the steps. But because an inverse waterfall does not give off any signals (no obvious cookies/credentials beyond what is required for the flow itself), it is far more passive. It is also easy to do regression analysis against your logs to identify possible fraud after the fact, which you can’t do with a credential system since credentials are rarely logged. So while it does have flaws, it certainly has some benefits too.
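As a rough illustration of the after-the-fact log check, the sketch below flags sessions that hit the final step of a flow without first walking through the earlier steps in order. It assumes time-ordered (session id, path) pairs have already been extracted from the logs. The function name and the flow passed to it are hypothetical.

```python
def flag_skipped_flow(events, flow):
    """Return the session ids that reached the last page of the flow
    without having visited every earlier page in order.

    events is a time-ordered iterable of (session_id, path) pairs.
    """
    next_step = {}  # session_id -> index of the next expected page
    flagged = set()
    final_index = len(flow) - 1

    for session_id, path in events:
        if path not in flow:
            continue  # pages outside the flow are ignored
        step = flow.index(path)
        expected = next_step.get(session_id, 0)
        if step == final_index and expected < final_index:
            flagged.add(session_id)  # hit the function too early
        if step == expected:
            next_step[session_id] = expected + 1

    return flagged
```

A flagged session is only a candidate for review, not proof of fraud: a legitimate user with a bookmarked page would be flagged too, and an attacker who follows all the steps would not be.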
Anyway, just a thought.