web application security lab

Abuse Alexa

Alexa is spyware. Alexa also provides an interesting website ranking service. Quadzilla even ran an article about how Alexa can be used to help Google index your website. I haven’t confirmed this claim, but it’s an interesting premise. Using other conduits to spider your site for you is pretty fascinating. Now, in the spirit of full disclosure, Quadzilla’s links to Alexa include a referral link (presumably to get some cash out of the deal), so I’ll probably end up verifying this claim at a later date to ensure it’s on the up and up (not that I don’t love Quadzilla, just that blackhat is as blackhat does).

Anyway, it got my brain spinning. As blackhat as Quadzilla is, I’ve got trickier things up my sleeve. First, let’s just make the assumption that he’s right and that it does help pagerank, which ends up driving more traffic. Alexa is still spyware and you probably don’t want it installed as his article suggests. Sure, you could throw it in a VMware session, but then you have to keep a VMware session set up just for that so it won’t take over the rest of your computer. So what are some alternatives?

Well, unfortunately Quadzilla’s link is about a year old, so the link to the software that he suggested downloading as an alternative is long since gone. So I decided to do a little spying on Alexa to see what it does. Here’s what a normal Alexa packet looks like:

(Screenshot: Alexa HTTP session)

Now let’s take a close look at this. It contains a link that I happened to be going to (the SEO section of my blog) and a bunch of variables. It also contains a User Agent that has the word Alexa in it. It contains a variable called twym65 that looks an awful lot like a unique identifier, as well as a cookie. Well, let’s take a look at a couple of those GET requests side by side (I cleaned this up to remove the URL encoding):

  • /data/sfqA41hH4oY0zp?cli=10&dat=snba&ver=7.0&cdt=alx_vw=20&wid=30040&act=00000000000&ss=1024x768&bw=1000&t=0&ttl=6407&vis=1&rq=8&url=http://www.google.com/
  • /data/sfqA41hH4oY0zp?cli=10&dat=snba&ver=7.0&cdt=alx_vw=20&wid=30040&act=00000000000&ss=1024x768&bw=1000&t=0&ttl=3703&vis=1&rq=9&url=http://www.yahoo.com/

Well, I have a sneaking suspicion that Alexa gets spammed a lot. Here are what look like some of their countermeasures. Firstly, the request includes the screen resolution in the variable “ss”. Secondly, my bet is that rq is the request count, since it increments (8, then 9). The ttl value (time to live?) changes between requests (6407, then 3703), so that could either be the time since the last request, or the time that the user visited the page, or something else (that would require more testing). The path of the URL itself (/data/sfqA41hH4oY0zp) appears to be pretty unique, so that is probably unique per visitor. So it’s probably not worth attempting to spoof an Alexa toolbar without having already downloaded it. So what are our other options?
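To make the side-by-side comparison less of an eyeball exercise, here is a minimal Python sketch that splits the two cleaned-up requests into parameters and reports only the ones that differ. The request strings are my transcription of the two lines above (with a plain "x" in the screen size and no stray space, which is an assumption about what was actually on the wire), and the function names are made up for illustration.

```python
from urllib.parse import parse_qsl

# The two cleaned-up GET requests from the post. Writing the screen size as
# "1024x768" and dropping the stray space are assumptions about the original.
REQ_A = ("/data/sfqA41hH4oY0zp?cli=10&dat=snba&ver=7.0&cdt=alx_vw=20&wid=30040"
         "&act=00000000000&ss=1024x768&bw=1000&t=0&ttl=6407&vis=1&rq=8"
         "&url=http://www.google.com/")
REQ_B = ("/data/sfqA41hH4oY0zp?cli=10&dat=snba&ver=7.0&cdt=alx_vw=20&wid=30040"
         "&act=00000000000&ss=1024x768&bw=1000&t=0&ttl=3703&vis=1&rq=9"
         "&url=http://www.yahoo.com/")


def split_request(request):
    """Return (path, params) for a request line like the ones above."""
    path, _, query = request.partition("?")
    # Each pair is split on its first "=" only, so "cdt=alx_vw=20" survives.
    return path, dict(parse_qsl(query, keep_blank_values=True))


def changed_params(first, second):
    """Return {name: (old, new)} for every parameter that differs."""
    _, a = split_request(first)
    _, b = split_request(second)
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


if __name__ == "__main__":
    for name, (old, new) in changed_params(REQ_A, REQ_B).items():
        print(name, old, "->", new)
```

Everything that does not show up in the diff (including the path) stayed constant across both requests, which is at least consistent with the guess that those values identify the install rather than the page view.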

Well, the main goal here is to get Alexa to see more of your pages. Again, we are going on the assumption that Alexa helps page rank (something I am still unsure of). If that’s true, all we need to do is to get Alexa to see those pages. How about if we use Alexa against itself? Since it’s spyware, it leaks a lot of information. One thing that doesn’t change depending on where it connects to is the User Agent. The User Agent prominently declares that the user has Alexa installed. Okay, so what are the requirements that Alexa uses to determine if one of its spyware packets is to be sent?
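Spotting a toolbar user on the server side could be as simple as the sketch below. All I actually know is that the User Agent has the word Alexa in it, so the check is a plain case-insensitive substring match; the sample strings in the usage lines are illustrative, not captured traffic, and the function name is my own.

```python
def has_alexa_toolbar(user_agent):
    """Guess whether a request came from a browser with the toolbar installed.

    Assumption: the only signal is the word "Alexa" somewhere in the
    User-Agent header, matched case-insensitively.
    """
    return "alexa" in (user_agent or "").lower()


if __name__ == "__main__":
    # Illustrative header values, not real captures.
    print(has_alexa_toolbar("Mozilla/4.0 (compatible; MSIE 6.0; Alexa Toolbar)"))
    print(has_alexa_toolbar("Mozilla/5.0 (X11; Linux) Firefox/1.5"))
```

A substring match is crude and trivially spoofed, but for this purpose a false positive only costs one visitor an extra redirect.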

Here’s what I was able to find. Alexa does not follow iframes or 301 redirects. By all appearances, Alexa only looks at the parent frame once it has finished loading. That’s slightly harder to mess with, but not by much. I could easily build a script that would identify an Alexa user (by the User Agent) and send that user via a 301 or JavaScript refresh or whatever to another one of my sites. Okay, so they hit another one of our sites. How is that useful? We lost pagerank on the site they hit first, right? Well, that’s only assuming we don’t redirect them back after a certain amount of time. Alexa doesn’t care what is actually on the page. If the page is 100% dynamic, it has no way of knowing that. If you bounce them from one page to another after the page has completed its loading, Alexa will register BOTH as hits.
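Here is a rough sketch of the outbound half of that bounce: serve the normal page to everyone, but give toolbar users a delayed meta refresh so the first page finishes loading before they move on. The partner URL, the delay, and the function names are all placeholders I made up, and I haven’t measured how long the page needs to sit before Alexa counts it.

```python
from html import escape

PARTNER_URL = "http://site-b.example/landing"  # hypothetical second site
DELAY_SECONDS = 5                              # assumed; not a measured value


def is_alexa(user_agent):
    """Assumed detection rule: the word "Alexa" appears in the User-Agent."""
    return "alexa" in (user_agent or "").lower()


def render_page(body_html, user_agent, bounce_to=PARTNER_URL,
                delay=DELAY_SECONDS):
    """Return the page HTML; toolbar users also get a delayed meta refresh."""
    refresh = ""
    if is_alexa(user_agent):
        # The delay lets this page finish loading before the browser leaves.
        refresh = '<meta http-equiv="refresh" content="{};url={}">'.format(
            delay, escape(bounce_to, quote=True))
    return "<html><head>{}</head><body>{}</body></html>".format(
        refresh, body_html)


if __name__ == "__main__":
    print(render_page("<p>SEO section</p>", "MSIE 6.0; Alexa Toolbar"))
    print(render_page("<p>SEO section</p>", "Firefox/1.5"))
```

I used a meta refresh rather than a 301 here because a 301 never lets the first page load, and a loaded parent frame is the thing Alexa appears to look at.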

Okay, but then the user is on another site, and isn’t seeing the content they wanted to see. Again, that can be dealt with by watching the referring URL. If both are your sites, you should know what the content was on the referring page. You can either display it on the other site, or you can redirect them back again to the primary site after the page has completed loading (a delay in the JavaScript refresh, or a meta refresh after a few seconds, etc…).
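The return trip might look something like this sketch. The second site checks the referring URL and, if the visitor came from the primary site, sends them back to the page they were on; the primary site skips the bounce for anyone arriving from the second site, so the visitor doesn’t ping-pong forever. Both host names are placeholders, and the whole thing assumes the browser actually sends a Referer header after a refresh, which not every browser does.

```python
from urllib.parse import urlsplit

PRIMARY_HOST = "site-a.example"    # hypothetical site the visitor wanted
SECONDARY_HOST = "site-b.example"  # hypothetical site we bounce through


def host_of(url):
    """Lower-cased host name of a URL, or "" if there is none."""
    return (urlsplit(url or "").hostname or "").lower()


def return_target(referer):
    """On the secondary site: the URL to send the visitor back to, or None."""
    if host_of(referer) == PRIMARY_HOST:
        return referer
    return None


def should_bounce_out(referer):
    """On the primary site: skip the bounce if the visitor just came back."""
    return host_of(referer) != SECONDARY_HOST


if __name__ == "__main__":
    print(return_target("http://site-a.example/seo/"))
    print(should_bounce_out("http://site-b.example/landing"))
```

If the Referer is missing, return_target gives None and the visitor simply stays on the second site, which is the case where displaying the original content there is the better fallback.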

The point being, you can easily use Alexa against itself and increase your Alexa ranking across many, many sites without ever installing it once.

3 Responses to “Abuse Alexa”

  1. Quadszilla Says:

    That was about a year ago and the (alexa generated) target pages don’t appear to have the same effect in Google. So they may have fixed that aspect of it.

    Even when I put that toolbar link in I never thought anyone would use it to shop at amazon (and no one has!). The only reason I bothered setting that part of it up was to have an SEO Blackhat link in the browser toolbar.

  2. RSnake Says:

    Haha… that makes more sense… I knew there was more to that story. Interesting. Thanks for the reply! I’m not sure if this hack has any merit in that case (if Google has indeed fixed this) other than increasing your Alexa rank to potentially make it a better domain to sell. I’m not in the domain selling business, but I’m sure it’s valuable to anyone who is.

  3. yeah Says:

    AID=sfqA41hH4oY0zp
    DateTime=2006-6-15 18:15:56
    IP=209.74.96.60
    if you are interested in alexa rank, contact me.