web application security lab

Archive for the 'SEO/SEM' Category

Clickbot.a Writeup

Friday, April 20th, 2007

I was sent this link today to a paper on Clickbot.a written by the Google AdWords guys. It’s a pretty interesting high-level read for the most part if you don’t know much about click fraud, and it gets into some of the technical details near the end on how the bot actually worked. While the conclusions of the paper are fine, I was struck that the authors failed to address the most important point.

The most important point is that the only reason this bot existed, and the only reason the hackers used it to compromise 100,000+ machines, is that it was economically lucrative to do so. That means Google’s detection was too slow to stop the attackers from making enough money to make it worth their while. It also came at the expense of the advertisers, as well as the poor web sites that were compromised for this purpose, no less. This means that Google’s detection methods need to improve to pick up not just this particular variant but also polymorphic versions that are far harder to detect. So while it is commendable for Google to fix this one issue, it shows they lack the technology to proactively defend against future, more mature variants.

While Google’s executive management feels that economics will solve this issue, I feel that Google is failing to see how detrimental this is to the advertisers who depend on quality click traffic. In lieu of this quality, alternative solutions must be in place to allow advertisers to recoup their costs while Google struggles to build new technology to defeat the issue. However, without access to the actual landing pages that the advertisers use, Google cannot have deep insight into the full picture. Ultimately, this will open a bigger rift over time that the attackers can exploit on the vast majority of sites that don’t use alternative click quality tools. Until Google can come up with a creative solution, companies like Click Forensics fill that void.

Spider Trap For Stopping Bots

Friday, April 20th, 2007

David Naylor (a semi-reformed SEO Blackhat) has an interesting writeup on how to stop badly behaving robots from spidering your site. I would hardly call this technique new (I’ve seen these scripts in one form or another for nearly a decade). However, it’s a good primer for anyone who runs a big website and who is otherwise powerless to stop people from doing it.

This technique doesn’t just work on robots though. Often people during manual assessments will look at the source code of a page and if they find hidden links or commented out pieces of code they will follow them, hoping to find something interesting from a security perspective. One alternative is to trap them and either put them into the matrix or ban them or otherwise log the activity. David’s article is worth a read if you are unfamiliar with how this stuff works.
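For anyone who has never seen one of these, here is a rough Python sketch of the general idea. This is my own illustration, not David’s script: the trap path, the in-memory ban list and the function name are all made up, and it assumes the trap URL is listed under Disallow in robots.txt and linked only from a hidden link that no normal visitor would ever click.

```python
# Hypothetical spider trap sketch. All names and the path are illustrative.
# Assumption: TRAP_PATH is disallowed in robots.txt and reachable only via a
# hidden link, so well-behaved robots and ordinary users never request it.
TRAP_PATH = "/trap/do-not-follow"

banned_ips = set()   # addresses that have tripped the trap
trap_log = []        # (ip, path) records kept for later review


def handle_request(ip, path):
    """Return an HTTP-style status code for a single request."""
    if ip in banned_ips:
        return 403                      # already banned, refuse everything
    if path == TRAP_PATH:
        banned_ips.add(ip)              # ban on first touch of the trap
        trap_log.append((ip, path))     # and log the activity
        return 403
    return 200                          # normal page
```

A real deployment would persist the ban list and probably expire entries, since banning by IP alone can catch innocent users behind a shared proxy.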

Hacking Matt Cutts - Death By 1000 Cutts Case Study

Tuesday, April 3rd, 2007

About once a month I get someone asking me why knowing what users are running is useful. People don’t seem to think reconnaissance is worth doing these days. I’ve heard people say things like, “Just try the attack and see if it works.” While it is sometimes totally worth just trying the attack when it is un-targeted, there are circumstances where that’s just not true. The first circumstance is where the attack takes a prohibitively large amount of resources. The second is where the attack leaves a big signature when it runs and you want to minimize that signature. The last, however, is the most interesting. The last is where I want to hack a single user, and I want to make it work the first time without fail. This is where recon is useful.

So I decided to pick a user out of the tens of thousands of people who have visited my site. As you all probably know by now, I’ve never been on super great terms with Google - it’s a long story that I’ll rant about over beers to almost anyone who asks. The point being I represent what we like to call a determined attacker. Not so much that I want to hack Google directly - that’s easy enough - but calling out their unofficial technology spokesperson while making a point about how important recon is to web application security is the best of both worlds. So I picked Matt Cutts, who runs the web-spam group at Google and who happens to be the person that SEO Blackhats most love to hate.

This case study has taken me a few months to put together, and I was thinking about releasing it at a conference at some point but why wait? I think it’s worthwhile to release it now before the noise of Bluehat, Blackhat and DefCon is upon us. In this case study that I’ve entitled Death by 1000 Cutts (as a jab at my own original case study entitled Death by 1000 Cuts) I combine a series of extremely minor information disclosures to mount a really nasty attack where I steal files directly from his machine using anti-anti-anti DNS pinning against Google Desktop. Rather than type the whole thing out again, I encourage you to read it for yourself. I hope this at least partially puts to rest people’s resistance to recon and proves why recon is a powerful tool in a determined attacker’s arsenal.

Windows Live Italy Being Used Maliciously

Tuesday, March 20th, 2007

Zach sent me a link to a Hack in the Box article about how Windows Live is being used by blackhat SEO (search engine optimization) to bring malware links to the top of the search results. This marriage between blackhat SEO and hacking is starting to take off. It’s unclear what tactic they used to get to the top of the search results, but clearly it worked, as they ended up taking over quite a bit of Live’s Italian site.

Once the users were on the Live.com site apparently they were served up links to malware sites. The search engine itself was used as a conduit for sending people to the malicious search pages. This is yet another reason why search engines shouldn’t index XSS. Even if the site is benign, they would be indexing links to malicious pages on benign sites. Anyway, interesting read, and it’s scary that the SEO community is now dabbling in hacking as well. It was only a matter of time.

Google Announces Invalid Domain Through Blacklisting

Thursday, March 1st, 2007

Click fraud is a big deal (Google claims it’s as low as a few percent but other leading industry experts disagree and put it much higher). I was actually fairly impressed that Google not only acknowledged the problem but is actually taking steps to prevent it that are visible to the consumer. Google announced a blacklist for domains that advertisers feel are highly likely to commit fraud. I kinda like this concept but like everything the devil is in the details.

Firstly, is an advertiser going to be able to block specific URLs, specific domains, specific URLs with keywords, or is it going to be ultra high level, like a “by category” type system? And how will the bad guys subvert that? I think we all know how poorly blacklisting works even with something as fine grained as HTML, let alone entire classes of sites on the Internet. Also, what will happen with an advertiser who blacklists a domain while other people are visiting that exact domain? Will the banner magically disappear? I think not, so what happens if someone clicks on that link? Has the website taken a risk that the advertiser can turn off the link at will and refuse to pay (allowing for advertising fraud), or will Google force them to pay regardless?

Ultimately, I don’t think the solution to Google’s click fraud numbers has anything to do with blacklisting. It’s a neat consumer feature, and may give them some small clout with advertisers who ask for this sort of thing all the time, but really, it’ll make next to no dent in the overall fraud numbers that Google sees (at least that’s my prediction).

Hacked .EDU Sites Used For SEO

Saturday, February 24th, 2007

I’m sure this is old news to some people but it’s the first time I’ve seen it show up in my logs. In the last twenty-four hours three different hacked .edu domains have shown up in my logs. Stanford.edu, UCNE.edu and ISI.edu have all been at least somewhat compromised, and the domains now host spam sites. Not so good.

Clearly the administrators of their domains have got some work to do to secure their sites. But it does cast some doubt on the “good” and “bad” domain concept. When a good domain goes bad, is it breakout (intentionally getting a good reputation and then converting to be bad) or is it spam? Either way, it’s clearly bad, but what to do about it? Do you blacklist the pages or the whole domain? That’s gotta make life a little harder for the search engines that try to stay away from spammy domains. Perhaps reputation and link popularity are a bad model after all.

Adblockplus Workaround

Monday, February 12th, 2007

I’ll probably regret this post at some point, and I have to caveat this by saying I love adblockplus (it’s a dream). However, it is also flawed. Whenever you do a straight string compare you risk missing something. It just so happens that when the string comparison is looking for something like ypn-js.overture.com, it misses one obvious way the client can request the JavaScript from the page: by IP address. But that alone isn’t magic. Anyone can swap in an IP address… and by the way, that alone won’t work because of the way Overture’s ads are built. Not only do they use ypn-js.overture.com for the initial JS lookup, but also for the subsequent iframe that contains the ads themselves.

Okay, easy enough… first we take the JavaScript and look for any variables that are set by the Overture JavaScript. We find one and then we check to see if it has been set. If it has, you can see that the ad is already there. If it hasn’t, the ad is not there, and you can write your own work-around. The reason we do this in this order is to make sure we don’t end up with two ads on the page (and we’d rather use the DNS if we can since that has built in IP failover).
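To make the string-compare weakness concrete, here is a tiny Python sketch. It is a simplified stand-in for any filter keyed on a hostname string, not how adblockplus is actually implemented. The hostname comes from this post; the /ads.js path and the 192.0.2.10 address (a documentation-only placeholder, not Overture’s real IP) are assumptions for illustration.

```python
# Simplified stand-in for a hostname-based ad filter (not adblockplus code).
BLOCKED_SUBSTRINGS = ["ypn-js.overture.com"]


def is_blocked(url):
    """Naive filter: block a URL only if it contains a blocked string."""
    lowered = url.lower()
    return any(s in lowered for s in BLOCKED_SUBSTRINGS)


# Hypothetical URLs: the path and the IP address are invented for the example.
by_name = "http://ypn-js.overture.com/ads.js"
by_ip = "http://192.0.2.10/ads.js"

print(is_blocked(by_name))  # the hostname form matches the filter
print(is_blocked(by_ip))    # the IP form contains no blocked string
```

The same resource requested by IP sails past the filter because nothing in the URL matches the blocked string, which is the whole point about string comparisons above.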

Here’s the demo. This could be very valuable to anyone in the SEO/SEM crowd who is plagued by users who turn off ads. Hint, hint, whitelist this domain, so I don’t have to mess with you guys. ;)

Malicious SERP Arbitrage Lessons

Friday, January 26th, 2007

I spent the better part of my free time today putting together a rather sophisticated search engine result page arbitrage tool. No, I won’t release this one. Partly because it sucks, partly because it requires that I allow other companies to run JavaScript on my domain, partly because it requires redirection, and partly because it’s easy enough that anyone with enough skill could do it themselves anyway. The point is I did it as a demonstration for a potential client, and there are some lessons learned. This is a pretty nasty tool for blackhat SEO/SEM types.

If you don’t know what I’m talking about, it was an old trick I talked about revisiting, which is making the back button on browsers change functionality (popups or redirection to other sites). Only my version actually mimics the search engine the user came from.

1) The arbitrager must understand that the user is coming from a search engine. There is more than one search engine. So they must code for each one that they want to steal traffic from. This alone can be a bit of a nightmare.

2) The traffic arbitrager must detect which links the user has already been to, so as to make sure that if they click on those links again they end up on the page they meant to go to. This will make it far less likely that the user will notice the trick. This is harder than it sounds because each site has a different style for the a:visited selector. And btw, case matters if there are any letters in the color (EG: 4e4e4e is different than 4E4E4E).

3) The site must take into account crazy JavaScript and Style sheets that all the search engines SERPs add to each of their pages. That can really mess things up (and definitely change the layout slightly). I defaulted to the no JS view that looks close enough - less likely to cause errors.

4) The first time the user visits, you need to redirect them to the page they originally meant to go to. If the cookie (state) showing that they have already been there exists, then when they hit back they see the fake SERP.

Ultimately I think this is a pretty powerful tool for malicious arbitragers. Pretty nasty, actually.

Alexa Fallacy - As if Anyone Thought Otherwise

Thursday, January 18th, 2007

Okay, no more theories, no more guesswork, I finally have proof that Alexa data does not jibe with actual real internet traffic patterns. Well, at minimum it doesn’t match what they claim it matches - it does prove other interesting facts, but I’ll get to that in a minute. My Alexa rating is pretty high. Ha.ckers.org does tend to get quite a bit of traffic (somewhere around 11k-14k unique users a day visit the site). Most of my traffic comes from the security industry, but I do have quite a few SEO readers - especially the blackhat SEO crowd. That comes from a long standing bridge between web application security and SEO, and it also happens to be that I’m one of the few security people who talks about both. For whatever reason I have a lot of webmasters who aren’t particularly interested in security who read my blog as a result of that bridge.

One thing that most SEO people (and indeed webmasters in general) have in common is that they happen to all have Alexa (or here for Firefox users) installed on their browser. It could easily be seen as spyware because it does report on where you are visiting, but it also gives you the relative page rank of the page for your troubles. This can easily help you assess if a site is new, or if it’s old, if it’s got a real following of people, or not, etc… It can also tell you if the domain gets traffic to other cnames (if that’s interesting).

But there’s been a long standing theory that Alexa data does not actually indicate true ranking. Finally I was able to prove it (at least to myself - maybe other people proved it to themselves before now, but I haven’t seen any hard and fast stats until today). So here’s what the current graph of my Alexa ranking is over the last several months:


[Graph: Alexa ranking for ha.ckers.org over the last several months]

You’ll notice the two biggest spikes on July 30th and January 16th (just a few days ago). So you would naturally assume those are huge spikes in traffic, right? Well let’s look at the significant events of those two days in particular. On July 30th 2006, the site was Slashdotted. To most people that’s probably a pretty significant event and you’d expect to see a huge jump in traffic. Let’s look at what it really did:


[Graph: site traffic around July 30th, 2006]

You can plainly see very little traffic change for the 30th. Sure, it was up a great deal compared with the average weekend, but it was really not much of a spike, and nothing like the traffic levels you’d expect for the 4,000th biggest website on the planet, right? Could it be that SEOs also tend to read Slashdot? Okay, that’s really just a theory, so let’s put it aside for now. Now let’s look at the events of just a few days ago (the 16th).


[Graph: site traffic for January 2007]

You can see that I did get a fair amount of traffic but it was nowhere near my highest day this month and I certainly didn’t jump up by more than a few percent of my normal traffic load (the 17th proved to be a much higher traffic day and the 8th was the day the firewall died on us). Where did all the Alexa traffic come from that made next to no increase in the number of users we get in a day? Well as you will remember, just a few days ago a self-proclaimed Whitehat SEO said he intended to hack a bunch of sites. Ha.ckers.org was named in that list of sites (despite the fact that I really do not consider ha.ckers.org to be an SEO website). Ha.ckers.org was linked to by his website, and many other SEOs picked up the list as well. In essence, every SEO that would possibly click on a link to a site named “ha.ckers.org” did click - and all in one day. Thus you can see only a minor increase in traffic on the 16th but a huge spike in Alexa ranking.
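If the mechanics aren’t obvious, here is a back-of-the-envelope Python sketch of how a toolbar panel can spike while real traffic barely moves. Every number in it is invented for illustration (the toolbar install rates and the visitor split are pure assumptions, not measurements from my logs or from Alexa); only the rough daily total is in the range mentioned earlier in this post.

```python
# All figures are hypothetical and exist only to illustrate sampling bias.
GENERAL_TOOLBAR_RATE = 0.01   # assumed share of ordinary visitors with the toolbar
SEO_TOOLBAR_RATE = 0.50       # assumed share of SEO visitors with the toolbar


def panel_visits(general, seo):
    """Visits a toolbar-based panel would observe for one day."""
    return general * GENERAL_TOOLBAR_RATE + seo * SEO_TOOLBAR_RATE


def pct_change(before, after):
    """Percentage change from before to after."""
    return (after - before) / before * 100


# A normal day versus a day when an extra 500 SEO visitors show up.
normal_general, normal_seo = 12000, 200
spike_general, spike_seo = 12000, 700

real_change = pct_change(normal_general + normal_seo,
                         spike_general + spike_seo)
panel_change = pct_change(panel_visits(normal_general, normal_seo),
                          panel_visits(spike_general, spike_seo))

print(f"real traffic change:  {real_change:.1f}%")
print(f"panel traffic change: {panel_change:.1f}%")
```

Under these made-up rates the real traffic moves by only a few percent while the panel’s view of the site more than doubles, which is the shape of what the graphs show.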

Quod erat demonstrandum.

It is interesting to note how many SEOs use the Alexa toolbar. I bet that database would give away a lot of SEO secrets if it were ever compromised. Spyware never sounded so good.

Someone Wants to Hack All Big SEO Sites.

Monday, January 15th, 2007

Someone named Fuckingpirate posted a very new blog today stating that he intended to hack a lot of the biggest SEO sites out there. Funny that I am somehow considered one of the biggest SEO sites since I rarely post about SEO (yes this is the second post today on it, but today has been the first in months).

Edit: Site is already down… so much for that game!

Edit: Site has moved to blogspot and there is a copy of the old site here. Additionally wolf-howl has been compromised.