web application security lab

Fixing XSS Can Cause Command Injection

Kishor sent me a link to a recent post he wrote as a follow-up to my previous post about how forgetting global replace can cause XSS. What he talks about is how doing something as simple as turning HTML into its equivalent entities can cause command injection. This is yet another reason why modifying content is a dangerous proposition.

Kishor notes that if < is changed into &lt;, a string such as xyx<ls -l turns into xyx&lt;ls -l, which still renders as intended in HTML. Passed to a shell, though, the & and ; introduced by the entity act as command separators, so ls -l ends up running as its own command. Obviously I’m not a fan of taking any user input and piping it through a system call, but if you have to do it, make sure to dump the script through a while loop to ensure that it’s not doing anything you don’t want it to. Something that’s okay for web content isn’t necessarily okay for SQL or commands or any other use. Just make sure you know what you’re doing with the text and don’t just blindly use it.
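
To make the context problem concrete, here is a minimal Python sketch. It assumes Python's standard html and shlex modules stand in for whatever escaping the application really uses; the helper names (escape_for_html, escape_for_shell, shell_separators_in) are made up for illustration and do not come from Kishor's post. It only inspects strings and never runs a command.

```python
import html
import shlex

# Characters a POSIX shell treats as command separators or operators.
SHELL_SEPARATORS = set("&;|")


def escape_for_html(text: str) -> str:
    """Escape text for the HTML context (hypothetical helper)."""
    return html.escape(text)


def escape_for_shell(text: str) -> str:
    """Quote text as a single shell word (hypothetical helper)."""
    return shlex.quote(text)


def shell_separators_in(text: str) -> list:
    """Return the shell separator characters that appear in text."""
    return sorted(SHELL_SEPARATORS & set(text))


user_input = "xyx<ls -l"

# Correct for HTML, but the entity brings in '&' and ';'.
html_safe = escape_for_html(user_input)
print(html_safe)
print(shell_separators_in(html_safe))

# Escaping for the context actually in use keeps the value as one shell word.
print(escape_for_shell(user_input))
```

The original string contains no separator characters at all; the HTML escaping step is what adds them. Where possible, passing arguments as a list to subprocess.run without shell=True avoids shell parsing entirely.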

9 Responses to “Fixing XSS Can Cause Command Injection”

  1. Chris Shiflett Says:

    This seems to be more a case of escaping for the wrong context than anything else. Concluding that the idea of escaping is therefore flawed is a poor conclusion, in my opinion.

    Escaping isn’t about modifying content; it’s about preserving content in a different context.

  2. RSnake Says:

    I think we agree actually. I’m not saying it’s bad to escape, I’m saying it’s bad to escape only in one context while using it in another.

  3. Chris Shiflett Says:

    “I’m not saying it’s bad to escape, I’m saying it’s bad to escape only in one context while using it in another.”

    You’re right; we do agree. :-)

  4. Andy Says:

    Though this raises the question of whose job it is, architecturally, to encode/decode.

    You’d like to do minimal input sanity checking, but if you actually want a person to edit HTML via a web interface you’re going to do a lot of crazy converting for no apparent good reason.

    At the same time, relying on all downstream consumers of the data to make sure it is safe for their use can increase the number and types of filters you have to write, but you get filtering/escaping that is much more semantically aware of why/how to escape.

    I haven’t seen any “standard” architectures published for this sort of thing…. have you?

  5. Jeremiah Blatz Says:

    @Andy:
    Generally, in my experience, it’s best to do whitelisting close to the input, and encoding/blacklisting close to the output. This makes a fair degree of sense architecturally, since you’re more likely to know what the valid characters are on the input side (e.g. an email address is “\b[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b”), and you’re more likely to know what the bad characters are on the output side (e.g. single quotes or non-digits for SQL). This is also advantageous in that you’ll often have two different people independently validating the input, which gives you greater depth of defense. Even if it’s not two people, it’s two contexts, which is still better than nothing.
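
The split described above (whitelist near the input, encode near the output) could be sketched in Python roughly as follows. The function names and the users table are hypothetical. The regular expression is the illustrative one quoted above, minus the \b anchors because fullmatch is used. A parameterized query stands in for the SQL-side handling, which is a different technique from blacklisting single quotes.

```python
import html
import re
import sqlite3

# Input-side whitelist: the illustrative (and admittedly incomplete) email pattern.
EMAIL_RE = re.compile(r"[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}", re.IGNORECASE)


def accept_email(raw: str) -> str:
    """Reject anything that does not match the whitelist (hypothetical helper)."""
    if not EMAIL_RE.fullmatch(raw):
        raise ValueError("rejected by input whitelist")
    return raw


def render_html(value: str) -> str:
    """Output-side encoding for the HTML context."""
    return html.escape(value)


def store_email(conn: sqlite3.Connection, email: str) -> None:
    """Output-side handling for SQL: bind the value as a parameter."""
    conn.execute("INSERT INTO users (email) VALUES (?)", (email,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT)")

email = accept_email("user@example.com")
store_email(conn, email)
print(render_html(email))
```

Each output function knows only its own context, so a value that slips past the input check is still encoded for whichever sink it reaches.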

  6. Jeremiah Blatz Says:

    P.S. I know that that email regexp is wrong, it doesn’t work on foo+bar@cmu.edu or other such cyrus/IMAP/whatever addresses. And the blacklist for SQL doesn’t apply to all databases or even all queries. They’re just examples.

  7. Jeremiah Blatz Says:

    (not to mention that “xyz

  8. Jeremiah Blatz Says:

    ….. err
    (not to mention that “xzy*left angle bracket*ls -l” still runs “ls -l” when inserted into a shell call)

  9. Steve Christey Says:

    I’ve been waiting for a while for someone to release a vulnerability that involves XSS and metacharacters, but so far I don’t know of any public reports. See my Bugtraq post “Re: ISA Server 2004 Log Manipulation” from May 2006 for context. I use this as an example of why blindly encoding everything is the wrong approach. Agree with RSnake - you definitely have to keep close track of your context, which argues for processing input/output as close to the boundaries as possible.