web application security lab

Defensive XSS and SEO

While trying to answer a question on the web application security forums, I went looking for a possibly archived link on the Internet Archive. While playing around with the Internet Archive I noticed it had a pretty boring XSS exploit in it:

Here is the XSS exploit. As I said, pretty boring. But then it occurred to me there is an old SEO (search engine optimization) trick that uses this engine. First, let me start by saying, I know there are ways around this, like turning off JavaScript. It was just something that sprang to mind.

One of the ways SEOs make money is to buy recently expired domains. Then they go to the Internet Archive and input that domain. They crawl the entire Internet Archive for all the content on the site, and then restore it to its original glory. The external links are still valid, and the pagerank slowly goes back to the way it was before the site changed ownership. Once the pagerank is good and high they can switch the site to whatever they want and monetize it as they wish. The original owner is powerless at this point.

But it occurred to me, if I have control over the output of the page, I could theoretically change what the scrapers see. I could add external links, or change them around. If I thought they were going to do it by hand, I could include JavaScript to make their day bad, or something as simple as spawning an alert to me elsewhere using a remote JavaScript include. It only relies on the original URL string containing that malicious code, and who’s going to include XSS on every single URL?

But then it got me thinking, there are other (probably easier and more effective) ways to do this, including IP delivery to the Internet Archive bot. You could put up heavy drug spam stuff, so that if someone ever steals your domain they’ll be hosting spam instead of your original content, or running malicious JavaScript, or hosting copyrighted materials. All of which preclude them from having a legitimate looking domain based on your content.
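To make the IP delivery idea concrete, here is a minimal sketch of the decision a server would make per request. The network ranges are placeholder documentation addresses, not real Internet Archive ranges, and the names `ARCHIVE_CRAWLER_NETWORKS`, `is_archive_crawler`, and `select_page` are made up for illustration.

```python
import ipaddress

# ASSUMPTION: placeholder ranges taken from the RFC 5737 documentation blocks.
# These are NOT the archive crawler's real addresses; you would have to
# substitute ranges you have verified yourself.
ARCHIVE_CRAWLER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_archive_crawler(remote_addr: str) -> bool:
    """Return True if the client address falls inside a listed crawler range."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Malformed address: treat it as a normal visitor.
        return False
    return any(addr in net for net in ARCHIVE_CRAWLER_NETWORKS)


def select_page(remote_addr: str, real_page: str, decoy_page: str) -> str:
    """IP delivery: the crawler gets the decoy, everyone else gets the real page."""
    return decoy_page if is_archive_crawler(remote_addr) else real_page
```

In a real deployment `remote_addr` would come from the web server (for example the WSGI `REMOTE_ADDR` value), and the approach is only as good as the address list behind it.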

5 Responses to “Defensive XSS and SEO”

  1. backtick Says:

    I made a suggestion on the www-html list to use a client-side XHTML attribute to mitigate XSS attacks, and it was captured in a blog entry. Do you think it could be of any help handling the current XSS issues (in a future implementation)?

  2. RSnake Says:

    Hi, backtick, I think this is very interesting, although one of the major issues I’ve seen with these types of ideas is that you now need to make sure they can’t input a </div>, or the security falls down. This is certainly worth more discussion. Have you spoken with the Mozilla or Microsoft folks at all? I could probably get you in touch with a few of those guys.
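The </div> problem can be illustrated with a small sketch. It assumes a hypothetical `data-scripts="off"` attribute standing in for the proposed client-side restriction (no browser implements it), and `wrap_untrusted` is a made-up helper, not part of the proposal.

```python
import html


def wrap_untrusted(user_html: str, escape: bool) -> str:
    """Place user content inside a container that is supposed to forbid scripts.

    ASSUMPTION: data-scripts="off" is a hypothetical attribute used only to
    mark the restricted region for this illustration.
    """
    body = html.escape(user_html) if escape else user_html
    return '<div data-scripts="off">' + body + "</div>"


payload = "</div><script>alert(1)</script>"

# Without escaping, the injected </div> closes the restricted container early,
# so the <script> that follows sits outside the restricted region.
broken = wrap_untrusted(payload, escape=False)

# With escaping, the markup characters become entities, so the payload stays
# inert text inside the container.
contained = wrap_untrusted(payload, escape=True)
```

So a client-side restriction of this kind still depends on server-side encoding to keep the container intact.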

  3. backtick Says:

    Hello, RSnake. I actually stated in the thread that the example was oversimplified and that it’s not a complete alternative to a server-side filtering solution but more of a second line of defense. I hadn’t thought about talking to the Microsoft or Mozilla security teams about it. I had some contact with both of them before, while researching a Drupal vulnerability, but feel free to use my email for any contacts you want to pass along.

  4. RSnake Says:

    Before I do, have you already looked at this?


    It occurred to me after I wrote this that this is really similar to what Gerv has already been researching. I actually proposed something very similar to both MS and Mozilla a few years back, and Mozilla felt it was horribly complex and suggested using iframes. In Microsoft's browser you can use <iframe security="restricted"> to turn off scripting inside the frame, but if you follow the thread, we’ve actually tested that with mixed results (namely, the trouble of resizing the iframe to fit its content without a scrollbar).

    Dean and I discussed another related issue, CSS clobbering, at the bottom of this thread, which is relevant as well: http://ha.ckers.org/blog/20060609/breaking-out-of-html-constructs-for-cross-site-scripting/

    It might be good to discuss those before we consider alternatives, as there is probably a lot of overlap.

  5. backtick Says:

    Hello, RSnake. I’ve read almost all the references in the blog entries you pointed me to, including both Gerv’s paper on content restrictions and his follow-up on script keys. We are revolving around the same axis, client-side enforced restrictions, but my proposal for the implementation is, I think, a bit simpler and easier for web developers to use.

    As a modification to my original proposal, the attribute or role value might be implemented to selectively allow script execution within some elements. In other words, the browser, by default, will not allow script execution except within elements that “explicitly” allow it with an attribute like allowcode=”true”. I think that might be safer, yet to some extent not as complicated as Gerv’s proposal.