web application security lab

Googlebot, Yahoo Slurp and MSNBOT Mapped

I was toying around with Google Cache a few days ago and I ran across an interesting thing. Google had cached the output of my environment variable display script, log.cgi. Log.cgi really doesn’t do anything but output the variables that the client sends to me. In this case, those were Googlebot’s environment variables.
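For readers who have not seen a script of this kind, here is a minimal sketch of what one might look like. It is not the actual log.cgi, whose source is not shown here. It assumes a standard CGI setup in which the web server exposes request headers as `HTTP_*` and `REMOTE_*` environment variables, and the names `format_env` and `CLIENT_PREFIXES` are made up for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical stand-in for log.cgi: echo the client-supplied CGI variables."""
import html
import os

# Prefixes of the variables a CGI server derives from the client's request.
CLIENT_PREFIXES = ("HTTP_", "REMOTE_")


def format_env(environ):
    """Return 'NAME = value' lines for client-derived variables, sorted by name."""
    lines = []
    for name in sorted(environ):
        if name.startswith(CLIENT_PREFIXES):
            # Header values are attacker-controlled, so escape them for HTML.
            lines.append(f"{name} = {html.escape(environ[name])}")
    return "\n".join(lines)


def main():
    # A CGI response is a header block, a blank line, then the body.
    print("Content-Type: text/html; charset=utf-8")
    print()
    print("<pre>" + format_env(os.environ) + "</pre>")


if __name__ == "__main__":
    main()
```

Any robot that fetches a page like this gets its own request headers echoed back in the response, and that response is what ends up in the search engine's cache.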

Then I did a little more research and found that Yahoo Slurp had logged itself too. Taking it one step further, I found that MSNBOT had found me as well. It is fairly amusing to see how much the various robots gave up. This is the first definitive proof of the various robots’ signatures that I’ve ever seen. Everything else has been subject to spoofing, but these signatures were recorded by the robots themselves when they fetched the script, so I know they are genuine. Anyway, it may be interesting for the SEO IP delivery community, like Kloakit. I’ll be talking more about this at a later date.
In case they get removed, here are the signatures:

Googlebot:

HTTP_ACCEPT = */*
HTTP_ACCEPT_ENCODING = gzip
HTTP_CONNECTION = Keep-alive
HTTP_FROM = googlebot(at)googlebot.com
HTTP_HOST = ha.ckers.org
HTTP_IF_MODIFIED_SINCE = Fri, 09 Jun 2006 02:43:40 GMT
HTTP_USER_AGENT = Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
REMOTE_ADDR = 66.249.65.107
REMOTE_PORT = 50637

Yahoo Slurp:

HTTP_ACCEPT = */*
HTTP_ACCEPT_ENCODING = gzip, x-gzip
HTTP_HOST = ha.ckers.org
HTTP_IF_MODIFIED_SINCE = Thu, 22 Jun 2006 03:33:16 GMT
HTTP_USER_AGENT = Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
REMOTE_ADDR = 68.142.250.77
REMOTE_PORT = 60560

MSNBOT:

HTTP_ACCEPT = text/html, text/plain, text/xml, application/*
HTTP_ACCEPT_ENCODING = identity;q=1.0
HTTP_FROM = msnbot(at)microsoft.com
HTTP_HOST = ha.ckers.org
HTTP_USER_AGENT = msnbot/1.0 (+http://search.msn.com/msnbot.htm)
REMOTE_ADDR = 65.54.188.105
REMOTE_PORT = 3319
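
As a rough illustration of how these signatures might be used, the sketch below compares an incoming request's headers against the three listings above. It is only a sketch under some assumptions: `STABLE_KEYS`, `SIGNATURES` and `match_robot` are hypothetical names, the choice of which headers count as "stable" is mine, and the values are copied verbatim from the listings, including the "(at)" in HTTP_FROM, which may be an obfuscation of the real header value. Addresses, ports, hosts and If-Modified-Since dates are left out because they change from visit to visit.

```python
# Headers assumed not to vary between visits from the same robot.
STABLE_KEYS = (
    "HTTP_ACCEPT",
    "HTTP_ACCEPT_ENCODING",
    "HTTP_CONNECTION",
    "HTTP_FROM",
    "HTTP_USER_AGENT",
)

# Values copied from the listings above; a missing key means the header was absent.
SIGNATURES = {
    "Googlebot": {
        "HTTP_ACCEPT": "*/*",
        "HTTP_ACCEPT_ENCODING": "gzip",
        "HTTP_CONNECTION": "Keep-alive",
        "HTTP_FROM": "googlebot(at)googlebot.com",
        "HTTP_USER_AGENT": "Mozilla/5.0 (compatible; Googlebot/2.1; "
                           "+http://www.google.com/bot.html)",
    },
    "Yahoo Slurp": {
        "HTTP_ACCEPT": "*/*",
        "HTTP_ACCEPT_ENCODING": "gzip, x-gzip",
        "HTTP_USER_AGENT": "Mozilla/5.0 (compatible; Yahoo! Slurp; "
                           "http://help.yahoo.com/help/us/ysearch/slurp)",
    },
    "MSNBOT": {
        "HTTP_ACCEPT": "text/html, text/plain, text/xml, application/*",
        "HTTP_ACCEPT_ENCODING": "identity;q=1.0",
        "HTTP_FROM": "msnbot(at)microsoft.com",
        "HTTP_USER_AGENT": "msnbot/1.0 (+http://search.msn.com/msnbot.htm)",
    },
}


def match_robot(environ):
    """Return the robot whose stable headers all match exactly, else None."""
    for robot, signature in SIGNATURES.items():
        # A header absent from the signature must also be absent from the request.
        if all(environ.get(key) == signature.get(key) for key in STABLE_KEYS):
            return robot
    return None


if __name__ == "__main__":
    # A client that copies only Googlebot's User-Agent but sends a
    # different Accept header does not match any recorded signature.
    spoof = {
        "HTTP_ACCEPT": "text/html",
        "HTTP_USER_AGENT": SIGNATURES["Googlebot"]["HTTP_USER_AGENT"],
    }
    print(match_robot(spoof))
```

A check like this only catches clients that copy the User-Agent string and nothing else. Anyone who replays the full header set would still pass, so the REMOTE_ADDR values remain the harder part to fake.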
