Paid Advertising
web application security lab

Archive for the 'SEO/SEM' Category

Buy Diggs and Votes on StumbleUpon

Thursday, January 3rd, 2008

There’s an interesting site called Subvert and Profit where the owner claims to sell diggs and votes on stumbleupon for traffic generation. Selling at $1 per vote/digg the goal is to monetize that traffic through various marketing campaigns or traffic arbitraging. Pretty interesting business model, and at worst it’s against the ToS of the various companies - it’s probably not illegal in any way. Blackhat SEM at it’s finest. It’s really not much different than buying paid links on websites if you think about it.

Some of the testimonials on the Subvert and Profit blog are pretty telling, such as, “the mind-boggling barrage of traffic which comes next, is nothing less than euphoric”. I can definitely agree that the volume of traffic from digg and stumbleupon, as well as reddit dwarfs slashdotting in our experience. Traffic arbitrage is here to stay, as long as the margins stay there. Pretty interesting!

Google Text Ad Subversion

Thursday, December 20th, 2007

There’s an interesting article over at ZDNet that explained that Google’s text ads are getting subverted by trojans on people’s machines to get them to click on other people’s ads. It wasn’t clear what those ads were, exactly, but there you have it. I see this kind of thing as a clear path for future monetization - similar to how bad guys are adding extra form fields into forms via malware to gain more information about your identity. Very clever, and easy to do.

This is different from when Google’s ads were spreading malware but has the same basic purpose. Ultimately getting code on people’s machines is the best way to get control of the machine and ultimately make money off of it via spam, clicks, or whatever else they come up with.

Another Fun SEO Blackhat Spam Tactic

Wednesday, September 19th, 2007

Searching through spam can be fun and annoying all at the same time. I found this beauty in my Wordpress moderation queue and thought it was worth a mention. Here’s a spam URL:

http://search.cnn.com/search?query=site%3Amultisquid.com%20-1995-mercury-outboard-serial-number

If you think about it, it’s a fairly ingenious tactic, using multiple sites to help your SEO. Firstly, they get me to link to a site (typically theirs, but in this case, it’s CNN, who is a trusted domain). Then CNN spits out the results (which would be there if Google hadn’t already nuked this site out of their index). The search engines follow their own results and give them link value. Very clever. No idea if it works or not, but it’s clever.

Facebook Says You Should Not Expect Privacy

Wednesday, September 12th, 2007

If there are any people left who think social networking is a safe place to enter your information I think this is a pretty telling story. Times Online has an interesting article on the latest move by Facebook regarding information that previously was inaccessible to search engines. Guess what? They’re going to make it publically accessible. It’s like people never learn (remind anyone of the AOL search query fun?). Okay - to be fair they said it’s only going to include, “basic details, including names and photographs” available.

I’m not sure I agree that it makes ID theft easier as Keith Reed, of Trend Micro said in the article, but it may make recon easier. I’m not saying that that’s not bad - because it very well may be bad, but it’s not as bad as it definitely could have been. But it’s a slippery slope. This quote caught my eye regarding Chris Kelly, Facebook’s chief privacy officer:

He suggested that internet-users could no longer expect to remain anonymous online, but could control only the amount of information about them that is available on the web.

I’m not going to disagree with the reality of the situation, but is it really okay that a social network takes this stance? Shouldn’t they instead be saying something like, “While we cannot guarantee your privacy, we will do everything in our power to insure our consumers have the highest level of privacy we can provide”? Granted, in the end it’s all about the dollar signs. They need to find a better way to monetize their traffic, and a lot of that means they need more users, they need to use the data they have better, and they need the search engines to start sending them more search engine traffic.

Preventing XSS Using Data Binding

Tuesday, August 14th, 2007

Stefano Di Paola sent me an interesting email the other day. Honestly, it took me a good hour of playing with it before I finally wrapped my brain around what was going on. Using data binding he can make JavaScript attach user content to the page while validating that it does not contain active content. That is, styles are okay, but JavaScript is not. Very interesting. Here’s the demo (warning, not for the technically feint of heart).

Stefano asked me to give my report on the good and the bad. The good is, this is pretty damned good at stopping XSS. It probably won’t stop abuse of styles that position themselves over other people’s content, but it would stop a good deal if not all XSS if implemented properly. That’s the good news (and that’s very good news for most people). Here’s the bad news.

The bad news is that it requires JavaScript to work. If you don’t have JS installed, forget it. That’s bad news for security people, bad news for accessibility, and even worse news for robots who are trying to get contextual understanding of the page. It also forces the bottom of the page to be where the user generated content is. That’s also bad for SEO because it means the most relevant content is at the very last part of the page. Depending on how the page is built and the spider, this may fall off the size limits of the robot. Not good. Lastly, it would reap havoc on lots of those poor web application scanners. They would light up like Christmas trees because there is NO output encoding done. None. Zip. It’s funny to make web application scanners have false positives, but it’s also a pain in the butt if you’re the operator of said scanner. Herein lies one of the advantages of scanners that use built in rendering engines (forgiving any other issues they may have).

So where would this be useful? Think about all those web2.0 applications out there that have to put dynamic content on the page, don’t have to worry about spiders, robots, and need to make sure that what they output is okay no matter what encoding, or any other craziness that users may put in. I’m not advocating being sloppy, and there may be other issues here that I haven’t found, but thus far, it’s looking like a promising technology. Very nice work by Stefano!

Is XSS Good For SEO?

Wednesday, May 30th, 2007

There’s an interesting post over at Venture Skills blog talking about if XSS is actually good for SEO purposes. While I don’t have any conclusive evidence that he is wrong or right (at least nothing that makes me satisfied by saying that is a correct or incorrect assessment), I will say I have seen evidence that blackhats definitely are using this and search engines definitely are indexing them.

I have also heard blackhat say that it works best when used as a “spice” within a mix of a lot of other normal links, rather then relying on them entirely. Again, I have no evidence that that is true or not, but I wouldn’t refute other people’s experience without evidence. One thing I think is important to mention is that XSS as it stands is NOT good for SEO, nor could it be. What blackhats use is HTML injection, not JavaScript injection. Also, it should be noted that XSS takes on three forms, only one of which is almost hopeless for a search engine to prevent and that is stored XSS. What I will say is it should be pretty easy for search engines to set up rules looking for commonly used reflected HTML injection techniques and devalue them.

Anti-Splog Evasion

Monday, May 21st, 2007

I know I’m really going to kick myself for this one, as it will no doubt come back to haunt me, but I’ve been thinking about this one for a long time. One of the things that Blackhat SEO types do is they attempt to scrape other people’s sites that have original content (such as mine). Then they post that content on their site as their own, attempting to raise their own page-rank. Because the search engines aren’t smart enough to know who is the original author, the sploggers get higher in the page ranks.

One of the tactics to evade them is to deliver unique content to them (a one time token or something of the like) that allows them to see the content, but if they attempt to replay it, the webmaster can tell who it is by going to their lookup table and seeing who scraped them. Often times you can shut them off at the source or do something more evil like I did. But there’s a way around it.

If you click on the image you can get an idea of the concept. The concept revolves around using more than one scraper (which is not a new concept - see splog hubs for more details - but that’s only been used to hide the real IP address in the past). The difference between that method and this method is that you use more than one scraper and then validate that the responses are the same. If they are, you’re good, if they aren’t the same (because there is a unique token in the content) the content can be either thrown away or the splogger can attempt to clean it up.

This would make it much harder for sites to protect themselves from sploggers attempting to steal copyrighted materials. So why am I writing this? Because I still have a few tricks up my sleeve to stop sploggers, but I thought it should at least be known that there are ways around some of the more obvious protection mechanisms.

Google Ads Spread Malware

Friday, April 27th, 2007

This is actually a really serious issue that was sent to me. The funny part is that I’ve known this was possible for years now and even already put it into a presentation I’m doing in a few weeks, but anyway Google’s ads have been spreading malware. A few people with Google accounts have been buying sponsored ads (no doubt with stolen credit cards/identities). It’s sure easier than getting to the top of the search results page!

Although I don’t think this signals the end of text ads, I think it’s a wise choice to consider any paid links to be just as untrustworthy as anything on the SERPs. Google, nor any search engine have been particularly good about vetting how good or bad a domain is before linking to it. Hey, money is money right? Although, I believe they will probably do a cursory scan of the domain to make sure it isn’t spreading malware in the future given the bad PR, it’s pretty easy to fool spiders into not seeing malware. So I’m not sure what actual protection this will provide.

My next thought was CSRF - if you buy a search term and include a few images to remote domains you can pretty easily get them to do things on your behalf, and it’s extremely targeted at the same time. Yah, that’s bad. Don’t trust those paid ads - it doesn’t matter if they are “sponsored” or not. As a side note, I was a little annoyed to read that Matt Cutts wants people to snitch out paid links. I think Google should look at it’s own problems before trying to hurt people’s revenue streams. At least with my paid links, I wouldn’t be risking people’s identity to click on them!

Clickbot.a Writeup

Friday, April 20th, 2007

I was sent this link today on Clickbot.a written by the Google adwords guys. It’s a pretty interesting high level read for the most part, if you don’t know much about click fraud, but does get into some of the technical stuff near the end on how the bot actually worked. While the conclusions of the paper are fine, I was struck that the authors failed to address the most important point.

The most important point being the only reason this bot existed, and the only reason the hackers used it to compromise 100,000+ machines - because it was economically lucrative to do so. That means Google’s detection was too slow to respond to and prevent the attackers from making enough money to make it worth their while. Also, it was at the expense of the advertisers as well as the poor web sites who were compromised for this purpose no less. Which means that Google’s detection methods need to improve to not just pick up this particular variant but also polymorphic versions that are far less easy to detect. So while it is commendable for Google to fix this one issue, it shows they are lacking the technology to pro-actively defend against future and less immature variants.

While Google’s executive management feels that economics will solve this issue I feel that Google is failing to see how detrimental this is to the advertisers who depend on quality click traffic. In lieu of this quality, alternative solutions must be in place to allow advertisers to recoup their costs while Google struggles to build new technology to defeat the issue. However, without access to the actual landing pages that the advertisers use, Google cannot have deep insight into the full picture. Ultimately, this will cause a bigger rift with time that the attackers can exploit on the vast majority of sites that don’t use alternative click quality tools. Until the time when Google can come up with a creative solution, companies like Click Forensics fill that void.

Spider Trap For Stopping Bots

Friday, April 20th, 2007

David Naylor (a semi-reformed SEO Blackhat) has an interesting writeup on how to stop badly behaving robots from spidering your site. I would hardly call this technique new (I’ve seen this scripts in one form or another for nearly a decade). However, it’s a good primer for anyone who runs a big website and who is otherwise powerless to stop people from doing it.

This technique doesn’t just work on robots though. Often people during manual assessments will look at the source code of a page and if they find hidden links or commented out pieces of code they will follow them, hoping to find something interesting from a security perspective. One alternative is to trap them and either put them into the matrix or ban them or otherwise log the activity. David’s article is worth a read if you are unfamiliar with how this stuff works.