ha.ckers.org web application security lab

XSS Worm Analysis And Defense


By RSnake

Date: 01/10/2008

Abstract: This paper is the product of a week-long contest to write the most diminutive self-replicating XSS worm. The contest was controversial; no prizes were awarded, and the goal was to build a worm that fit in the fewest bytes possible. In the end two people, Giorgio Maone and Sirdarckcat, tied the contest with a stunningly small 161 byte cross-browser compatible XSS worm. However, the journey was as interesting as the result.

To date there has been little to no research into the methods of XSS worm propagation and optimization. The lack of research is due in part to the scarcity of real-world worms. Each example found in the wild had three problems: 1) it contained site specific code, 2) it contained obfuscation for filter evasion, and 3) it contained a payload. Also, cross browser compatibility was not always present, making it harder to determine exactly what the propagation code looked like. Rather than attempt to triage code that was not designed with these issues in mind, a contest was built to gather sample code that could be used for more in depth analysis.

Browser companies have not yet constructed a way to display user submitted content in a way that protects the website and its users from malicious behavior, but still allows feature rich user submitted content. People sometimes say that certain social networking sites don't write secure code. That's not always the entire truth. Often the code is highly secure and could be made more secure with the flip of a switch; it's only that the business landscape requires the code to do insecure things for economic and user satisfaction reasons. Together, those things require that we search for alternative solutions to the problem, and give ourselves opportunities to build more secure websites based on what the diminutive proof of concept worm code shows us.

Assumptions: The theoretical social networking site under discussion is vulnerable, not because its owners are unaware that it is vulnerable, but because consumers demand rich content. It is assumed that companies want to do the right thing in terms of security, but often cannot because of the business rules they must obey. It is also assumed that the attacker has prior knowledge of the domain. The code that the attacker builds must fit within a field of a certain length (like a name field, or a title) and will be rejected if it exceeds that length (there are reasons this limitation was put in place that will be discussed later in the paper).

Worm Problems: There are some major issues when building a self propagating worm which were intentionally avoided using contest rules. One of the most important and difficult rules to abide by was the issue of growth. One of the rules stated that the worm could not grow in length once it was submitted. There are three reasons this rule was critical to adhere to.

The first reason is probably the simplest and also the least likely to be a real issue. HTML input lengths, code based data size limitations and, most importantly, database field lengths may all impose a hard size limit that leaves room for little more than the worm itself. The second, semi unlikely reason is that the worm author may want to limit the negative impact of the worm until it has reached maximum propagation, by keeping down both the database space it consumes and the bandwidth required for propagation. The latter also has the benefit of increasing the rate of propagation.

The last and single most important reason to limit growth was this example by Ronald:

While Ronald's example could have actually won the contest, sans the growth rule, it had the flaw of linear growth. Let's use an illustrated example with smaller byte counts, for demonstration purposes only. First, let's assume there is a hard database field limit of 50 characters, and that instead of rejecting longer content the site simply chops off anything beyond 50 characters. A sample page may look like so:

Site content here - 17 chars
attacker's vector here - 23 chars
More site content here - 23 chars

So after the first iteration of worm propagation, the site would chop off anything greater than 50 chars, which has the effect of chopping off the last part of the page. That's not an immediate problem, because the vector is still there. Let's look at the second page to which the worm was posted:

Site content here - 17 chars
Site content here - 17 chars (original variant's header)
attacker's vector here - 23 chars
More site - 10 chars (chopped off from the first iteration of the page)
More site content here - 23 chars

In the next iteration the last 7 bytes of the worm code would be chopped off as the site content continues to grow, which will break the attacker's worm. It would now look like this:

Site content here - 17 chars
Site content here - 17 chars (second variant's header)
Site content here - 17 chars (original variant's header)
attacker's vect - 16 chars (broken vector)
More site content here - 23 chars

Even if the vector worked without the missing seven bytes, it wouldn't survive the next iteration of worm propagation, because the site would cut off the next 17 bytes, leaving only the headers of pages from previous propagations. So in this way, a hard limit of n bytes, combined with linear growth of the content that precedes the worm, will eventually cause the worm to stop propagating, unless there is no other content on the page prior to the worm.

Note: The word growth is not really correct, as in reality the submitted content stays a static size (the maximum that the script allows). So although the information before the worm grows linearly until the code breaks, the actual content submitted does not grow beyond the limits of the website.
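The truncation example above can be replayed as a small simulation. This is only a sketch under stated assumptions: the names FIELD_LIMIT, HEADER, VECTOR, FOOTER and propagate are hypothetical, the byte counts are the illustrative ones from the example (17, 23, 23 and a 50 character limit), and the model assumes each propagation stores the whole previous field wrapped in the new page's header and footer before the site truncates it. It is not Ronald's code.

```python
FIELD_LIMIT = 50      # hard database field limit assumed in the example

HEADER = "H" * 17     # stands in for "Site content here" (17 chars)
VECTOR = "V" * 23     # stands in for the attacker's vector (23 chars)
FOOTER = "F" * 23     # stands in for "More site content here" (23 chars)


def propagate(stored: str) -> str:
    """Wrap the previously stored field in a new page, then truncate.

    Assumption: the worm re-posts everything around it, so the content
    in front of the vector grows by one header per generation.
    """
    page = HEADER + stored + FOOTER
    return page[:FIELD_LIMIT]   # the site silently chops off the overflow


stored = VECTOR
history = []                    # vector characters surviving each generation
for generation in range(1, 5):
    stored = propagate(stored)
    history.append(stored.count("V"))
    print(f"generation {generation}: "
          f"{history[-1]} of {len(VECTOR)} vector chars survive")
```

Under these assumptions the vector is intact after the first generation, loses 7 characters in the second, and is gone entirely by the third, which is the linear growth failure described above.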

Ultimately, though, the reason to limit length was to reduce the three things mentioned earlier - obfuscation, site specific code and payloads. This leads us to the next issue. As mentioned before, there is a theory that making a worm small inherently obfuscates it, because of the coding tricks necessary to reduce its size. While this is mostly a red herring, because it is not the form of obfuscation referred to earlier (filter evasion obfuscation), there were examples of size optimization that caused one of the other issues we had attempted to avoid (the site specific coding issue).

One of the rules of the contest was that the submissions must POST to "post.php". The goal here was to require posting to a page other than the page the worm was on. Early on, oxotnick asked a good question: should we assume that the page you are submitting to is in the same directory, or relative to the base directory? For the purposes of the contest, people were asked to assume they were in the same directory. However, this and the naming conventions used within the contest rules led to an interesting site specific coding optimization:

While the code is entirely valid for the rules, in reality it's not portable. If the name had been anything but a string beginning with the word "post", like the word "test", this code would not have functioned. So while this code is interesting from an optimization perspective, it needs to be ignored for analysis.

Worm Best Practices: At one point Ronald pointed out that images may be a better universal vector for XSS worms than things like iframes or scripts. This may be true, given that many sites do allow images by default. Without any real numbers to back this up, it's speculative, but possible. At the very least, the visual fingerprint is smaller without sizing. In another thread, DoctorDan posed an interesting question regarding whether XMLHttpRequest is a better propagation method than the other prevalent method, which was submission of a form.

The benefits of XMLHttpRequest are many. Firstly, it's much more silent, because it doesn't actually force the user's browser to visually change to another page. It's not just visually silent though, as the auto-submit method also can make a clicking sound if the user's browser is set up to do this (which is often default). Also bwb labs had a great point regarding a looping effect of the submission method. Let's take a specific example of a site that upon submission automatically shows you the content you just submitted.

In doing so it would show the victim the payload and would automatically post the content back once again - putting the user's browser into an infinite loop. While this type of setup isn't universal, it was worthy of note, and could easily lead people to be more interested in using the XMLHttpRequest method for propagation. For the purposes of the contest, submission based worms were not forbidden, as there are many sites that don't have this setup, and even if it may spiral a user's browser out of control with submissions, that may be the attacker's intent, or it may be inconsequential to the attacker.

Worm Defense: Two interesting problems stopped a number of variants, which resulted in a new rule during the contest. While they are not considered good worm defense, they did stop a number of worm variants, requiring further work. The first is the declaration of onfocus and onload event handlers in the body tag. The second, found early on by ritz, was a problem with his code on pages that had a DOCTYPE assigned. So it would appear that using these would stop a certain number of worm variants. Both of these issues were eventually worked around once the new rule required the worm code to work despite them. Either way, these issues were worthy of mention.

Let's take a step back for a moment. The earlier comment, "Browser companies have not yet constructed a way to display content in a way that protects from malicious behavior, but still allows feature rich user submitted content" is an overstatement. In this case, the browser companies have provided a single useful tool. In fact it is a fairly powerful tool - the iframe. This is not in reference to the on-page iframing that was proposed with content restrictions. This is in reference to the normal off domain iframe.

One of the reasons people don't use iframe is because they are concerned about the search engine value of the page they are constructing. If a company dices up their page into iframes, they will lose search engine value. The major search engines of the world haven't figured out a way to keep SEO (search engine optimization) value the same when you split your page into two different pieces (the protected content on your domain, and the potentially dangerous user submitted content on another domain). The one exception to that rule is using cloaking - where you display both the site content and the user content on the same page, only when the search engine spiders your site.

Note: Google has been hypocritical about its corporate opinion on cloaking, telling small companies it is not okay, and telling enterprises it is. So Google has created an unfair marketplace, and as such the use of cloaking is potentially dangerous to your business, given Google's predisposition to blacklisting based on it. Cloaking is deemed blackhat SEO based on rules that have not yet been communicated publicly, at least not in their entirety. The use of cloaking, even to protect consumers, may get a website banned, so its use is not recommended unless there is an agreement made with Google prior to implementing this technique.

To get real value out of this worm analysis we shouldn't ignore our history lesson. The first and biggest XSS worm in history was the MySpace Samy worm. One of the site specific things that Samy wrote in his worm was the following:

The reason this is important is because Samy was trying to overcome a basic principle in browser security - the same origin policy. Samy knew that the bulk of his code, which used XMLHttpRequest, wouldn't work unless he switched domains to the one that allowed his worm to function. This leads us to the next part of defenses. One thing that has been mentioned a lot in defeating cross site request forgeries (CSRF) is the use of a nonce, or a one time token. Nonces can be read by XMLHttpRequest if they are on the same domain. So it would stand to reason that it is better to emit the nonce on another domain that the site knows is completely free of XSS vulnerabilities. That is because XMLHttpRequest must obey the same origin policy.

Note: Please note that Mozilla has discussed a cross domain version of XMLHttpRequest. It is unclear if this would open this up further to attack or not, but clearly any technology that allows for cross domain reading whether intentionally or not would break not only this technique but many other security protections built into websites.
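The one time token itself can be sketched on the server side as follows. This is a minimal illustration, not the implementation of any particular site: the class NonceStore, its methods and the session identifiers are hypothetical names, the storage is a plain in-memory dictionary, and the sketch assumes the token is issued by the separate, XSS-free domain so that a worm running on the user content domain cannot read it with XMLHttpRequest.

```python
import secrets


class NonceStore:
    """Hypothetical in-memory store of one time tokens, keyed by session."""

    def __init__(self) -> None:
        self._pending: dict[str, set[str]] = {}

    def issue(self, session_id: str) -> str:
        """Create an unguessable token to embed in the post button."""
        nonce = secrets.token_urlsafe(32)
        self._pending.setdefault(session_id, set()).add(nonce)
        return nonce

    def consume(self, session_id: str, nonce: str) -> bool:
        """Accept a token at most once, and only for the session it was issued to."""
        pending = self._pending.get(session_id, set())
        if nonce in pending:
            pending.discard(nonce)   # a replayed token is rejected
            return True
        return False


store = NonceStore()
token = store.issue("session-a")
first_use = store.consume("session-a", token)    # legitimate post
replayed = store.consume("session-a", token)     # same token again
guessed = store.consume("session-a", "guess")    # worm guessing a token
```

A real deployment would also need token expiry and the state management between the two domains; both are left out here to keep the sketch short.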

So it would make sense to emit the nonce in a button on another domain, since that domain is not readable by the JavaScript worm. This lends itself to a different sort of attack, like the one against Google Desktop, where an attacker floats a small or even invisible iframe beneath the user's mouse so the button can still be pressed. Although this does require some user interaction, mouse clicks are so commonplace that it wouldn't be any surprise if this were used in a real worm.

However, if there is an anti-framing script that detects if the post page is being framed and then un-frames itself before the user has an opportunity to be subverted, that could easily protect the page from being clicked on. There is, however, a snag. In Internet Explorer iframes can be tagged with security=restricted. This turns off JavaScript on the iframe. That would stop the script that did frame detection from running.

Note: Please note that Firefox does not have an equivalent to security=restricted in off domain iframes, so this technique will work as described for Firefox users.

Although it appears all hope may be lost, there is an additional possibility. If the button is not a static button, but rather a button that itself is at least partially rendered using JavaScript, an attacker's use of security=restricted will not only cause the page frame detection script to fail to load, it will also cause the button to not include the nonce. This is problematic for users who don't have JavaScript enabled though, so an alternative must be provided. For those users, the button may instead point to a login page, where the user is requested to log in to post their user content. This non-JavaScript alternative is only a minor inconvenience and affects approximately 0.1% of the user population (based on the number of users who surf without JavaScript enabled).

Note: It is important to address the non JavaScript version of this technique due to concerns over lawsuits initiated by the National Federation of the Blind under the Americans with Disabilities Act.

This technique, combined with some sort of state management between the two domains, should help thwart XSS worms. It must be restated that this paper is authored with the assumption that dynamic user content must be allowed for the site's business operations to continue, but that all other site code is designed to be as secure as possible.

Note: Please note that this technique does not protect against any browser bugs that allow a browser to break the same origin policy (Eg: the now-defunct mhtml bug or DNS rebinding).

Items Not Covered: There are several items missing from this paper. They include things like filter evasion, payload and command and control. Both filter evasion and payload have been covered by countless posts over the last several years in various forms and forums. Command and control is still widely debated, and no concrete observations can be discussed at this point beyond some of the requirements for a solid command and control structure for polymorphic XSS worms and for worms that have delayed payloads, based on a certain density of infection.

Additionally, while there has been some talk about tracking worm activity and at least one project attempting to help mitigation with the assumption that any potentially dangerous content is disallowed, this paper did not discuss either of these issues as they are far more in depth and will almost certainly require more thought and research by a larger audience.

Summary: While XSS worms are hardly a solved issue, some of the findings of the diminutive XSS worm replication contest definitely help construct the solutions outlined in this paper. This is anything but a definitive list of all ways to thwart XSS worms, but it should be a good primer on some of the findings that came from the XSS worm contest, and should help. Without browser modifications, this appears to be the best software agnostic solution to worm propagation. No doubt revisions to this technique and others will yield better results.

Thanks: Special thanks to everyone who helped contribute to our understanding of worm propagation over the last week (in chronological order): .mario, thornmaker, digi7al64, Gareth Heyes, Matt Presson, sirdarckcat, ritz, Alex, barbarianbob, BlahBlah, arantius, bwb labs, ma1, Spyware, Reiners, Spikeman, Ronald, Torstein, dev80, amado, shawn, hallvors, DoctorDan, oxotnick, dbloom, Kyran, tx, 4909, beNi, backStorm, badsamaritan and anyone else I may have missed. Without these people and their talent, this research would never have been possible. Also thanks to thrill and id for providing edits and feedback on this paper.