Alexa is spyware. Alexa also provides an interesting website ranking service. Quadzilla even ran an article about how Alexa can be used to help Google index your website. I haven’t confirmed this claim, but it’s an interesting premise. Using other conduits to spider your site for you is pretty fascinating. Now, in the spirit of full disclosure, Quadzilla’s links to Alexa include a referral link (presumably to get some cash out of the deal), so I’ll probably end up verifying this claim at a later date to ensure it’s on the up and up (not that I don’t love Quadzilla, just that blackhat is as blackhat does).
Anyway, it got my brain spinning. As blackhat as Quadzilla is, I’ve got trickier things up my sleeve. First, let’s just make the assumption that he’s right and that it does help PageRank, which ends up driving more traffic. Alexa is still spyware, and you probably don’t want it installed as his article suggests. Sure, you could throw it in a VMware session, but then you have to keep a VMware session set up just for that so it won’t take over the rest of your computer. So what are some alternatives?
Well, unfortunately, Quadzilla’s link is about a year old, so the link to the software that he suggested downloading as an alternative is long since gone. So I decided to do a little spying on Alexa to see what it does. Here’s what a normal Alexa packet looks like:
Now let’s take a close look at this. It contains the URL I happened to be visiting (the SEO section of my blog) and a bunch of variables. It also contains a User Agent that has the word Alexa in it. It contains a variable called twym65 that looks an awful lot like a unique identifier, as well as a cookie. Well, let’s take a look at a couple of those GET requests side by side (I cleaned this up to remove the URL encoding):
- /data/sfqA41hH4oY0zp?cli=10&dat=snba&ver=7.0&cdt=alx_vw=20&wid=30040&act=00000000000&ss=1024x768&bw=1000&t=0&ttl=6407&vis=1&rq=8&url=http://www.google.com/
- /data/sfqA41hH4oY0zp?cli=10&dat=snba&ver=7.0&cdt=alx_vw=20&wid=30040&act=00000000000&ss=1024x768&bw=1000&t=0&ttl=3703&vis=1&rq=9&url=http://www.yahoo.com/
Well, I have a sneaking suspicion that Alexa gets spammed a lot. Here are some of what look like their countermeasures. Firstly, the request includes the screen resolution in the variable “ss”. Secondly, my bet is that rq is the request count, since it increments. The ttl value (time to live?) changes between requests, so it could be the time since the last request, or the time the user spent on the page, or something else (that would require more testing). The token in the URL path is identical in both requests and looks fairly random, so it is probably unique per visitor. So it’s probably not worth attempting to spoof an Alexa toolbar without having already downloaded it. So what are our other options?
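To make that side-by-side comparison less error prone, here is a minimal Python sketch that parses the two cleaned-up requests above and reports only the parameters that differ. The function names (`parse_params`, `changed_params`) are my own for illustration, and the meaning of each parameter is still guesswork; the script only shows which values change, not why.

```python
from urllib.parse import parse_qsl

# The two cleaned-up GET requests from the capture above.
REQUESTS = [
    "/data/sfqA41hH4oY0zp?cli=10&dat=snba&ver=7.0&cdt=alx_vw=20&wid=30040"
    "&act=00000000000&ss=1024x768&bw=1000&t=0&ttl=6407&vis=1&rq=8"
    "&url=http://www.google.com/",
    "/data/sfqA41hH4oY0zp?cli=10&dat=snba&ver=7.0&cdt=alx_vw=20&wid=30040"
    "&act=00000000000&ss=1024x768&bw=1000&t=0&ttl=3703&vis=1&rq=9"
    "&url=http://www.yahoo.com/",
]


def parse_params(request):
    """Return the query-string parameters of a request line as a dict."""
    # Everything after the first '?' is the query string.
    _path, _sep, query = request.partition("?")
    # parse_qsl splits each pair on the first '=' only, so a value such as
    # 'alx_vw=20' (for the 'cdt' key) is kept intact.
    return dict(parse_qsl(query, keep_blank_values=True))


def changed_params(first, second):
    """Map each parameter that differs to its (first, second) values."""
    a, b = parse_params(first), parse_params(second)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }


if __name__ == "__main__":
    for key, (before, after) in changed_params(*REQUESTS).items():
        print(f"{key}: {before} -> {after}")
```

Run against these two requests, only ttl, rq and url come back as different; everything else, including the path token, stays the same from one request to the next.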
Well, the main goal here is to get Alexa to see more of your pages. Again, we are going on the assumption that Alexa helps PageRank (something I am still unsure of). If that’s true, all we need to do is to get Alexa to see those pages. How about if we use Alexa against itself? Since it’s spyware, it leaks a lot of information. One thing that doesn’t change depending on where it connects to is the User Agent. The User Agent prominently declares that the user has Alexa installed. Okay, so what are the requirements that Alexa uses to determine if one of its spyware packets is to be sent?
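Before getting to that, the User Agent leak is easy to act on from the server side. Here is a rough Python sketch that flags visitors whose User Agent mentions Alexa. The only thing I actually observed is that the word Alexa appears in the header; the helper name `has_alexa_toolbar` and the sample strings below are made up for illustration and are not copied from a real capture.

```python
def has_alexa_toolbar(user_agent):
    """Return True if the User-Agent header value mentions Alexa.

    Assumption: the toolbar always puts the word 'Alexa' somewhere in the
    User Agent. The exact format of the string is not relied upon.
    """
    # A missing header (None) or an empty one simply counts as "no toolbar".
    return "alexa" in (user_agent or "").lower()


if __name__ == "__main__":
    # Hypothetical User Agent strings, for illustration only.
    samples = [
        "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Alexa Toolbar)",
        "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) Firefox/2.0",
        None,
    ]
    for ua in samples:
        print(has_alexa_toolbar(ua), ua)
```

A plain substring match is about as crude as it gets, but it doesn’t depend on guessing the exact header format, which is the part I can’t vouch for.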
The point being, you can easily use Alexa against itself and increase your Alexa ranking across many, many sites without ever actually installing it once.