David Naylor (a semi-reformed SEO Blackhat) has an interesting writeup on how to stop badly behaving robots from spidering your site. I would hardly call the technique new (I've seen these scripts in one form or another for nearly a decade), but it's a good primer for anyone who runs a big website and is otherwise powerless to stop people from scraping it.
This technique doesn't just work on robots, though. During manual assessments, people will often look at the source code of a page, and if they find hidden links or commented-out pieces of code they will follow them, hoping to find something interesting from a security perspective. One alternative is to trap them: put them into the matrix, ban them, or otherwise log the activity. David's article is worth a read if you are unfamiliar with how this stuff works.
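The idea can be sketched in a few lines of Python. This is not David's actual script; the trap path, log file name, and WSGI setup below are illustrative assumptions. The bait is a link that only shows up if you read the page source (or ignore robots.txt), so anyone who requests it is either a misbehaving bot or a human poking around, and gets logged and banned:

```python
# Minimal link-trap sketch (illustrative only). The trap path, log file,
# and in-memory ban list are made-up names, not anyone's real setup.
import time
from wsgiref.simple_server import make_server

TRAP_PATH = "/hidden/admin-backup/"   # reachable only via a commented-out link
BANNED = set()                        # a real deployment would persist this (file, DB, firewall)

def log_hit(ip, path, user_agent):
    """Record anyone who requests the trap URL."""
    with open("trap.log", "a") as fh:
        fh.write(f"{time.ctime()} {ip} {path} {user_agent}\n")

def app(environ, start_response):
    ip = environ.get("REMOTE_ADDR", "unknown")
    path = environ.get("PATH_INFO", "/")
    ua = environ.get("HTTP_USER_AGENT", "-")

    # Already trapped? Refuse everything.
    if ip in BANNED:
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"Forbidden"]

    # Only source-readers and robots.txt-ignoring spiders end up here,
    # so log the visit and ban them (or redirect them into a decoy "matrix").
    if path.startswith(TRAP_PATH):
        log_hit(ip, path, ua)
        BANNED.add(ip)
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"Forbidden"]

    # Normal page: carries the invisible bait link that regular visitors never see.
    body = (b"<html><body>Welcome."
            b"<!-- <a href='/hidden/admin-backup/'>backup</a> -->"
            b"</body></html>")
    start_response("200 OK", [("Content-Type", "text/html")])
    return [body]

if __name__ == "__main__":
    make_server("", 8000, app).serve_forever()
```

In practice you would also add a `Disallow` entry for the trap path in robots.txt so well-behaved crawlers never touch it, and push bans out to the firewall or web server config rather than keeping them in application memory.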