web application security lab

Spider Trap For Stopping Bots

David Naylor (a semi-reformed SEO blackhat) has an interesting writeup on how to stop badly behaving robots from spidering your site. I would hardly call this technique new (I’ve seen these scripts in one form or another for nearly a decade). However, it’s a good primer for anyone who runs a big website and who is otherwise powerless to stop robots from crawling where they shouldn’t.

This technique doesn’t just work on robots though. During manual assessments, people will often look at the source code of a page, and if they find hidden links or commented-out pieces of code they will follow them, hoping to find something interesting from a security perspective. One option is to trap them and either put them into the matrix, ban them, or simply log the activity. David’s article is worth a read if you are unfamiliar with how this stuff works.
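For anyone who hasn’t seen one of these, here is a rough sketch of the idea in Python. It is not David’s script, just a minimal illustration under my own assumptions: the trap URL is made up, and it is assumed to be linked only from hidden or commented-out markup and disallowed in robots.txt, so that well-behaved crawlers and ordinary visitors never request it.

```python
from dataclasses import dataclass, field


@dataclass
class SpiderTrap:
    """Ban any client that requests a path no legitimate visitor should reach."""

    # Hypothetical trap URL (an assumption for this sketch): reachable only via
    # hidden or commented-out links, and disallowed in robots.txt.
    trap_paths: frozenset = frozenset({"/trap/do-not-follow.html"})
    banned: set = field(default_factory=set)
    log: list = field(default_factory=list)

    def handle(self, ip: str, path: str) -> int:
        """Return the HTTP status code to send for this request."""
        if ip in self.banned:
            return 403  # already caught earlier
        if path in self.trap_paths:
            # The client followed a link it was told (or had no reason) to avoid.
            self.banned.add(ip)
            self.log.append((ip, path))
            return 403
        return 200


trap = SpiderTrap()
print(trap.handle("203.0.113.7", "/index.html"))               # normal page
print(trap.handle("203.0.113.7", "/trap/do-not-follow.html"))  # springs the trap
print(trap.handle("203.0.113.7", "/index.html"))               # now banned
```

Banning by IP is the simplest response; logging only, or feeding the client junk pages, would slot into the same place in `handle`.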

4 Responses to “Spider Trap For Stopping Bots”

  1. Kishor Says:

    Hey, if I visit
    http://fusion.google.com/add?feedurl=http://www.davidnaylor.co.uk/badrobots.html

    won’t google get blocked?

    Same with google translate etc.

  2. RSnake Says:

    Yes, that is exactly correct, and a good point. I never do this type of blocking for that very reason. It is incredibly easy to abuse.

  3. kaes Says:

    not to mention you could easily do a CSRF image on some large forum or related blogs and make the site ban a large part of its intended audience.

  4. DaveN Says:

    Kishor, it’s only a basic attempt to stop bad bots. You might want to set up a whitelist of bots, and a cron job to output which IPs you caught so you can decide whether they are blacklisted or whitelisted.. just a thought :)

    DaveN
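The whitelist-and-review step DaveN describes might look something like the sketch below, which a cron job could run over the trap log. Everything named here is hypothetical: the whitelist contains a documentation-only network (RFC 5737) standing in for whatever crawler ranges you actually trust, and `triage` is just an illustrative helper. Nothing gets banned automatically; caught addresses are only sorted for a human to look at.

```python
import ipaddress

# Hypothetical whitelist. 192.0.2.0/24 is a documentation range (RFC 5737),
# used here as a placeholder and not a real crawler network.
WHITELIST = [ipaddress.ip_network("192.0.2.0/24")]


def triage(caught_ips, whitelist=WHITELIST):
    """Split caught IPs into (whitelisted, to_review) lists, preserving order."""
    whitelisted, to_review = [], []
    for raw in caught_ips:
        addr = ipaddress.ip_address(raw)
        if any(addr in net for net in whitelist):
            whitelisted.append(raw)   # trusted source, never ban
        else:
            to_review.append(raw)     # leave the decision to a human
    return whitelisted, to_review


ok, review = triage(["192.0.2.10", "203.0.113.7"])
print("whitelisted:", ok)
print("needs review:", review)
```

Keeping the final ban decision manual is what blunts the abuse Kishor and kaes point out, since a third party can trick trusted services or innocent visitors into requesting the trap URL.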