Cenzic 232 Patent
Paid Advertising
web application security lab

Username Based Primary Key Issues

Okay, I had more trouble naming this post than I did writing it so bear with the title, because I think this is actually a pretty interesting problem. I went out for cigars with a few uber old school hackers from a huge company last night and we got to talking about databases. I’m not exactly sure what lead my brain down the path, but I suddenly realized there’s actually some fairly tricky issues around data deletion/retention that can lead to sensitive information disclosure. Here’s how it works. Let’s say you have three people, Alice, Bob and Cathy. Alice and Bob are both good users of a web site message board. Maybe they have special privileges, or maybe they are just communicating privately and Cathy wants that information.

In a secure database setup, if a user is deleted from a system no other user can take over their information after they are gone from the system. It’s not deleted (for data integrity/forensics, etc…) but it should be inaccessible to users. However, it turns out that not all database structures are made equal. Sometimes people use usernames as primary keys, or hashes of usernames, etc… instead of a non-reputable number. So let’s say Alice had her account deleted (intentionally or accidentally) there is an opportunity for Cathy to create a user account with the same name as Alice’s username and suddenly gain access to the same information that was supposed to be accessible only to Alice.

Let’s take this further and bring back an old favorite and normally not all that interesting attack - DoS by CSRF - that is forcing a user to click on a “delete my account” link. Typically DoS isn’t all that interesting to me, but here’s where you can actually use it to take over the account. This all relies on knowing the username of the person you are attempting to compromise. So let’s say Cathy creates an account on the message board. On a forum she places an image link to http://www.evil.com/x.php?.jpg which looks at the referring URL before making the decision on whether to show a benign image, or redirect the user’s browser via 301 or 302 redirect to the delete user function (eg: http://www.good.com/deluser.php). If the referring URL contains the username “Alice” the Alice user’s browser automatically is redirected and the account is deleted.

At this point Alice is probably almost immediately aware that her account has been deleted, so it is time critical that Cathy immediately registers with the same name “Alice”. Assuming the database uses the registered user as the primary key (or some derivation of the user name) it is possible for Cathy to now assume control over Alice’s account. Now she can read Alice’s private communication with Bob, or if Alice had some administrative controls, it might be possible for her to do other nefarious things, like promote other user’s access, etc…

There is an alternative to trying to detect Alice’s name in a referring URL. One example is if images can be sent within private messages or other out of band communications, assuming Alice is already logged in it will have the same net effect of targeting the individual user. It’s also fairly easy to detect this in a blackbox situation simply by creating an account, using all the functions, deleting it and re-creating the same name to see if any data was retained during the creation of the account. The point being, it’s bad form to use a username as a primary key unless it’s forever off limits for future users attempting to re-assume that username. Kind of an esoteric problem, but as I talked with the gurus, at least one of them had actually seen this happen in his own penetration tests against some very large well known companies. So there you have it, it’s not even just a theoretical attack. Yet another thing to look out for and at least one good reason to protect account deletion functions from CSRF.

19 Responses to “Username Based Primary Key Issues”

  1. Robert Says:

    Have you actually seen a real world example of this?

    - Robert
    http://www.cgisecurity.com/

  2. Top8 Says:

    Hi, rSnake, this is off-topic, but I want to ask your opinion of this
    http://top8corp.blogspot.com/2008/03/idea-for-ad.html

  3. Andi Says:

    Hi,

    I think this can be an issue if you are a BAD database designer ;)

    1. For performance reasons you should never take a username as primary key, but you can add a unique index to it.
    2. If you use ints as pks, the user table will auto_increment the value and you will never have the case that anyone can create a new profile for your old name after you have deleted (or flagged as deleted) your profile.
    3. To go further you should obfuscate auto_increment ids in APIs and all interfaces that are public.

    Can you give me an example of a site that really uses user names as keys to make critical CUD actions? - What you describe here is a good idea for us hackers ;)

    Andi

  4. RSnake Says:

    @Robert - not me, no, I just came up with it last night. But as I said, at least one of the guys I was talking to last night had in several different systems.

  5. RSnake Says:

    @Top8 - that’s pretty funny. I’m flattered that I am the new bad guy to be thwarted. Fortunately I don’t wear a black hat so I don’t stand out too much in crowds. ;)

  6. Rogan Dawes Says:

    So, if you are intending to keep items for a particular userid, even after that userid is no longer active, why on earth would you you delete the userid? Simply mark it as “deleted”/”inactive”/whatever, and you are done.

    If, as you say, this actually happens in the real world, the designers of those applications clearly didn’t think about their requirements very hard. What is the value in keeping history for a user if that user can be deleted? You may have some history, but you don’t know anything about that user!

    What’s the point of that?!

  7. donwalrus Says:

    Its an interesting topic, and i’ve actually seen this type of (poor) DB design at clients in the past. *Usually* I see email addresses being used more frequently as primary keys, rather than usernames (which is *slightly better*), but –depending on the situation–really doesn’t add too much complexity….of course that sort of attack may relate more to your latter post here: http://ha.ckers.org/blog/20080305/changing-email-addresses-for-spam/
    regarding changing email addresses of valid accounts….but nonetheless

  8. Jim Manico Says:

    This is easy to solve. Like Rogan is suggesting, just do a “soft delete” of the account. Or better yet, do what almost every major website does - require email confirmation.

  9. mckt Says:

    My company’s internal database was designed that way originally. Rewriting the thing was the first thing I did when I started working there.

    I’ve seen it done that way a lot in other applications too. I know of at least one meta-programming application that does this, though I’m pretty sure nobody uses it anymore (it was dated when I ran across it 5ish years ago)

    Actually that one kinda surprised me- you’d think if you were building an application that builds applications, you’d be abstracting things out as far as you can.

  10. maluc Says:

    To add to what Andi said, a good designer should also always:

    1.) Map the foreign keys that relate all the tables to one another. (set them as constraints)
    2.) Set those child tables to be automatically deleted when the parent table is deleted. (adding ‘on delete cascade’ to constraint)

    Example in postgres, but should be standard:
    Tables:
    users (”id”,”username”,”password”)
    messages (”senderId”,”receiverId”,”message”,”timestamp”)
    *the messages should automatically remove a row when either the sender or receiver deletes their account*
    SQL to add:
    ALTER TABLE messages ADD
    CONSTRAINT “$1″ FOREIGN KEY (senderId)
    REFERENCES users (id) MATCH SIMPLE
    ON UPDATE CASCADE ON DELETE CASCADE,
    CONSTRAINT “$2″ FOREIGN KEY (receiverId)
    REFERENCES users (id) MATCH SIMPLE
    ON UPDATE CASCADE ON DELETE CASCADE

    That should solve the issue even if the user ID (or username if that was the one referenced) was reused later.

  11. yawnmoth Says:

    Being able to assume someone else’s privileges in this manner extends beyond just a single database.

    Say you register for a message board with a hotmail.com account. After three months of inactivity, hotmail.com will delete your account at which point someone else could (I assume) take it over. They’d then be privy to all the messages you were.

    They could reset your password, get notifications of new private messages, or whatever.

    Even worse… say you’re at a university and used a *.edu email address for a bunch of online stores. Then you graduate, lose your *.edu email address, at which point, someone else could come along and take your old one. Assuming the original owner didn’t change his email address on all those stores (a reasonable assumption, imho - do you know every single online store you’ve ever created an account with?), you’d be able to make purchases on their credit card, maybe even get their credit card #, etc.

  12. yawnmoth Says:

    Eep. Just now saw your next post, which address this…

    Oh well.

  13. yawnmoth Says:

    Actually, never mind… I just read the first sentence and assumed it discussed the same thing, heh.

  14. antonio Says:

    For years an ISP who’s name I won’t disclose used a billing system that was keyed off the customer phone number. Turned out that if you ever changed your phone number and notified the ISP - they’d update their records in billing which immediately ‘orphaned’ the account. No further bills would be sent to that customer.

    A couple thousand people enjoyed free internet services until it was discovered.

  15. crazy_lil_white_guy Says:

    I think the primary concern here is the non-standardization of the data access model when dealing with web applications (or any app for that matter). All applications need some sort of data access, and there are 2 primary methods for this access (and each has their issues).

    The first is every application user has an associated database account, where data access is granted by way of database permissions. This is the lowest common denominator for data access, does the user have the appropriate permissions to access the dataset they are requesting… 1 or 0… and yes, that can be bypassed by a number of means if the there are issues within the overall logic of the application, issues with the underlying schema of the database, database security, direct DB access… blah blah blah… that’s another topic.

    Then there are the applications that use 1 database user and the application logic determines the permissions to tables, views, stored procedures, triggers, etc… And of course that is a rock solid model :S Sharepoint is a prime example of a application who’s entire security rests on the integrity of application level security. Essentially you have 1 database user, who is the grand pumba of the data, it grants access through application logic to what data users can access… pop the app… you pop all the data.

    So now you have 2 data access models, each with their own set of issues… how do you address these issues?

    Does my RFC 1918 address traverse to the internet? No

    Why is it that the primary mechanism for data storage, retention and access (all the shit ppl are worried about) not standardized?

    Nothing is perfect, but data access is completely whack, and is left up to developers that don’t understand security to enforce it.

    If there was a standard, people would break it… which is a good thing, because you can develop a common solution.

    I understand that that applications needs to be able to accommodate authentication and data access for both publically accessible applications, as well as internal applications, and each has their own dependencies. But really… if the people that are developing these databases and applications are so smart, shouldn’t a common model for data access exist?

  16. AviD Says:

    A slight variant of this, which I have actually seen in several OTS products that I’ve reviewed:
    Business systems that use Windows-based authentication often keep an internal reference to the authenticated user. If that reference is based on the username alone (which, as I’ve mentioned and I’m sure you’ve seen, is often the case…), a user in another domain with an identical username (sans the domainname) can take over the user’s privileges, and identity in general.

    Besides the fact that this could happen accidently - either a user from another domain, or after the user is deleted and another user is created with the now-available username - a malicious domain admin can easily create specific new accounts with the same username as system administrators (sometimes this is Administrator!). Note that the evil admin has his own domain, not the main corporate domain…
    Obviously, the system should authenticate the user against domain\username (or username@domain.com…), and better yet store the user reference as a SID. Though this is not often done….

    And, as long as we’re on the topic, Windows-authentication-based apps are especially vulnerable to CSRF, anyway ….

  17. AviD Says:

    Just wanted to clarify, the difference is that in this case its not necessarily the DB design, rather the authentication implementation.

  18. Duncan Says:

    I know this is an old post, but I wanted to comment on this:

    “I think this can be an issue if you are a BAD database designer ;)

    1. For performance reasons you should never take a username as primary key, but you can add a unique index to it.”

    I have to disagree. It’s bad database design to use a numeric surrogate key when you have a good natural key available for use as a primary key- this is, in fact, a fundamental principle of relational database design. Database designs where everything is keyed off auto-incrementing counters are a good sign of an inept database designer. Now, there may be some circumstances in which you’d want to denormalize like this for performance reasons, but for the most part a good db should be able to handle short character strings as primary keys without losing all that much in the way of performance. There are some other reasons to use surrogate keys that are, to my mind, better justifications- in particular, if you are using an ORM tool that requires it, you have little choice in the matter. And if the only natural key you can find is a composite over 4 fields, pragmatism probably demands that you use a surrogate key. You should at least make sure that you have a unique constraint on a natural key though, if the table can be normalized at all.

    In the case in question (a message board) I would probably not want usernames to be recyclable. I might also want to be able to reinstate a user at a later date, and I probably wouldn’t want their messages, PMs they might have sent other users, etc., to vanish as if they had never existed. Assuming that I don’t want to allow users to ever change their username (and I, at least, likely don’t) it seems to me that in this case the “soft delete” mentioned above is the correct thing to do. If I do want to allow them to change usernames, that complicates things a bit.

    In other cases you might want to delete all information associated with the account using constraints in the db, as maluc says.

    Anyway, I think this whole “always use numeric indexes as primary keys” thing is pernicious. There are times when it’s best to use surrogate keys, but it’s important to understand that surrogate keys are at best a necessary evil. They are never a positive good.

  19. aa Says:

    when i delete references of a primary key , the primary key should also be deleted .. how can we do that