Sometimes it Sucks Being a Search Engine Spammer
Somehow I ended up on the dumb side of a search engine spammer. I have no idea why anyone would think this would be a good site to rip off - you have to be a serious newbie to think that’s a good idea. Anyway, there I was, getting pingbacks and referring URLs and people telling me that my site was being ripped off by some dumbass. The only vaguely amusing side of this is that other SEO blogs have been hit by this recently too.
I’ve got to think this is just some sort of dumb joke, but that would be way too smart. No, this is just stupidity. So anyway, it was fairly trivial to figure out who was ripping my RSS feed. So it took me a few seconds to modify my document management system to do some IP delivery to the moron, and a few seconds of searching on the web for some nice prescription drug spam and poof! His site now looks like a bad spam doorway page and will continue to do so even more so with every post he indexes.
Then I do a little research on the idiot himself and I find out all his infoz:
– removed –
Why on earth would you hand pick this site, out of every site on the web, and think for a second I wouldn’t fuck with you? Way to go, moron. I’m not a spammer hater, but come on. Get a clue!
He also registered all his domains with Godaddy, and this is totally against their TOS - (copyright infringement). He’s lucky I don’t get all his sites nuked, dumbass.



July 12th, 2006 at 7:26 pm
Now THAT is just too damned funny!
Vic
July 12th, 2006 at 7:56 pm
Thank you, thank you… Some people just don’t “get” it. But at least it was worth a laugh.
July 13th, 2006 at 3:42 am
OMFG ROFL , such a dumbass
btw, that was a cool way to pay it back, even if you should have put a warez or pr0n site on his webspace ^^
anyway– i had to laugh about ten minutes about this
July 13th, 2006 at 7:58 am
[…] He thought wrong. I’ve got to think this is just some sort of dumb joke, but that would be way too smart. No, this is just stupidity. So anyway, it was fairly trivial to figure out who was ripping my RSS feed. So it took me a few seconds to modify my document management system to do some IP delivery to the moron, and a few seconds of searching on the web for some nice prescription drug spam and poof! His site now looks like a bad spam doorway page and will continue to do so even more so with every post he indexes. […]
July 13th, 2006 at 9:40 am
Heheh, I love that he works at Best Buy… that’s my favorite part. Btw, in case anyone is curious, I believe he uses Magpie RSS to aggregate the content.
July 13th, 2006 at 11:09 am
[…] Sometimes the script kiddies really should think about who they take aim at. Some people think it’s a great idea to pass off another site’s feed as their own content. Too bad if the site you pick happens to belong to a hacker, of all people. […]
July 13th, 2006 at 12:28 pm
Very nice Rsnake!
Would you mind sharing the code? He is scraping me as well, not that I care, just for the fun.
July 13th, 2006 at 2:44 pm
countzero and rsnake,
i own zigzo and i have removed that plus the rest of the splogs that were created. magpie RSS was not installed on the server and i think that someone was using an application rather than a script to splog. The actual blog script was wordpress-mu.
Some of the other splogs that were listed on zigzo had redirect codes and whatnot injected into the posts somehow.
Anyways, to both of you i am sorry your content was stolen and please know it has been totally removed.
On another note Brett is NOT the owner of the sites and has absolutely nothing to do with any of the sites you mentioned via the RSS comments that were posted on seojournal.zigzo.com. He is a friend of mine and he used to own angelicamateurs.com. This site was hosted on my server at one point but it is no longer and clearly says so by the whois. He got out of that game a LONG time ago.
I beg you to please stop posting his information.
Again i apologize to both rsnake and countzero for the bs this shit has put you both through and myself included. Also, i have included my email in this comment… zigxxx@gmail.com you can both email me for anything at all.
thank you.
July 13th, 2006 at 2:45 pm
i see you have removed his info already.
thank you SO much. i appreciate this x 1 million.
July 13th, 2006 at 3:04 pm
No need, countzero! Woo! I was able to completely shut down the site! Here is the text from the website (zigzo.com):
Wait, who’s the spammer? Nooo, that would be YOU Zigx - I just made use of your own automated copyright infringement and turned it against you. Sucks to be taught a lesson, huh? Try swimming with the kiddies before jumping in with the sharks, kay? I’ve just got to say it again, because it just amazes me - why on EARTH would you think ripping this site, out of every site in the whole wide world, would be a good idea?
Now the schmuck would have me believe that the owner of the domain the machine was hosting was not the same owner as the website on it. If Zigx would like to tell us his real identity, I could redirect my efforts towards him instead. He’s lucky I didn’t completely wipe out every machine in his whole IP range AND get his domains revoked. Frankly I think I was pretty nice. I barely did anything to him actually.
And btw, ALL of the webmasters that had their sites ripped off run legitimate blogs, Zigx was the only one who wasn’t! Anyway, since I was effective in my anti-newbie mission I’ve removed the personal info from my site, so he can rest easy at nights.
And now, it’s miller time!
July 13th, 2006 at 3:24 pm
Zigx, I saw your posts after I posted mine, which is why this seems out of order. Anyway, all’s forgiven, data’s been removed (stored, but removed), and the world is all back to normal.
Next time, ask. Nuff said.
July 13th, 2006 at 3:47 pm
Alrighty then, nothing for me to do here.
Moving on..
July 13th, 2006 at 5:03 pm
“So anyway, it was fairly trivial to figure out who was ripping my RSS feed. So it took me a few seconds to modify my document management system to do some IP delivery to the moron”
Nice, nice work
Would love to hear a technical explanation of how you did it. I’ve got a few people stealing my stuff, too.
July 13th, 2006 at 5:20 pm
Pete: Read my blog in a bit, I’m posting on it — www.seoegghead.com
July 13th, 2006 at 5:33 pm
Hahah, no no, Matt, I’ve got this one covered, you can rest easy.
Thanks for posting!
Pete, sure, it was pretty trivial actually. I saw a pingback from an IP address (66.28.59.112) that had my stuff in it pointing to a server at 66.28.59.117 (seojournal.zigzo.com). Notice how close those IPs are together. That’s how I was able to get all those other domains tied together.
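For anyone who wants to check how close two addresses are, here is a rough sketch in Python (not the code used on this site). The helper name `same_subnet` and the choice of a /24 prefix are assumptions made for illustration; sharing a /24 only hints that two machines are related, it does not prove it.

```python
import ipaddress

def same_subnet(ip_a: str, ip_b: str, prefix: int = 24) -> bool:
    """Return True if both addresses fall inside the same /prefix network."""
    # strict=False lets us build the network from a host address.
    network = ipaddress.ip_network(f"{ip_a}/{prefix}", strict=False)
    return ipaddress.ip_address(ip_b) in network

# The pingback source and the splog server mentioned above.
print(same_subnet("66.28.59.112", "66.28.59.117"))
```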
Then I went and parsed through my logs seeing what other things that IP address had done. It turns out it also happened to be checking my RSS feed once every hour or so (kinda random, I’m not sure why exactly).
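A minimal sketch of that kind of log search, in Python. It assumes the server writes the common Apache-style access log format; the function name `paths_requested_by` and the sample lines are made up for illustration and are not taken from the real logs.

```python
import re
from collections import Counter

# Matches the start of a common/combined log format line (an assumption
# about the log layout): client IP, two ignored fields, timestamp, request.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<when>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)'
)

def paths_requested_by(log_lines, ip):
    """Count how often the given IP requested each path."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group("ip") == ip:
            counts[match.group("path")] += 1
    return counts

# Hypothetical sample lines, only here to show the shape of the input.
sample = [
    '66.28.59.112 - - [12/Jul/2006:10:00:01 -0700] "GET /blog/feed/ HTTP/1.1" 200 5120',
    '10.0.0.5 - - [12/Jul/2006:10:03:10 -0700] "GET /blog/ HTTP/1.1" 200 9000',
    '66.28.59.112 - - [12/Jul/2006:11:02:44 -0700] "GET /blog/feed/ HTTP/1.1" 200 5120',
]
for path, hits in paths_requested_by(sample, "66.28.59.112").items():
    print(path, hits)
```

Printing the timestamps as well (the `when` group) is an easy way to spot a polling interval like the hourly one described above.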
So then I modified my trusty blogging tool (finding the right place to modify was the only hard part) to do some IP delivery based on the IP address with something like so:
if ($_SERVER['REMOTE_ADDR'] == "66.28.59.112") :
And in the case of that IP address, I delivered some Cialis spam to him, plus some specially crafted HTML that ended in an open comment tag.
The open comment tag killed any content after it until it reached the end of the next comment (which was WAAAAY down the page). Making it so that the link to me and my domain was now commented out, and most of the rest of the content on the page was messed up.
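To see why an unterminated comment swallows so much of the page, here is a simplified model in Python. It is not a real HTML parser: it just treats everything from `<!--` up to the next `-->` (or the end of the document) as hidden, which is roughly how the effect described above plays out. The function name and the sample markup are made up for illustration.

```python
import re

# A comment runs from "<!--" to the first "-->" or, failing that, to the end.
COMMENT = re.compile(r"<!--.*?(?:-->|\Z)", re.DOTALL)

def visible_html(html: str) -> str:
    """Return the markup left over once comments are dropped (simplified)."""
    return COMMENT.sub("", html)

injected = "<p>decoy text</p><!--"          # feed content ending in an open comment
scraper_page = (
    injected
    + '<a href="http://ha.ckers.org/">source</a><p>sidebar</p>'
    + "<!-- footer -->"                      # the page's next real comment
    + "<p>end of page</p>"
)
print(visible_html(scraper_page))
```

Everything between the injected `<!--` and the end of the page's own next comment disappears, including the link back to the source.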
I also added a Chinese bad word that makes the China firewall block the traffic, so that no one in China could see the page (including Baidu.com and its robots).
And then, every time I posted, it would update the time of day it was posted, with the real title, but with my fake Cialis spam content, and poof! The deed was done.
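The steps above can be sketched roughly as follows. This is Python rather than the PHP in the one-liner quoted earlier, and it is not the actual modification: the `Post` class, `feed_item_for`, and the placeholder decoy text are all hypothetical. Only the IP address comes from the description above.

```python
from dataclasses import dataclass

# Address seen in the pingback; anything listed here gets the decoy body.
SCRAPER_IPS = {"66.28.59.112"}

@dataclass
class Post:
    title: str
    published: str
    body: str

def feed_item_for(post: Post, remote_addr: str, decoy_body: str) -> dict:
    """Build one feed item, swapping only the body for known scraper IPs."""
    body = decoy_body if remote_addr in SCRAPER_IPS else post.body
    # Title and timestamp stay real, so the scraper sees a normal new post.
    return {"title": post.title, "published": post.published, "body": body}

post = Post("Example title", "2006-07-12T19:00:00", "The real article text.")
print(feed_item_for(post, "66.28.59.112", "decoy text <!--")["body"])
print(feed_item_for(post, "203.0.113.9", "decoy text <!--")["body"])
```

Keying on a single IP is fragile, since a scraper can move to another address; it worked here only because the fetching machine was so easy to identify.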
The only hard part was finding out everything about him (his age, his birthday, his domains, his email addresses, his phone numbers, where he lived, who he lived with (his Dad, and his Dad’s info), where he worked, what he looked like, what his hobbies were, where he went to school, what his major was, who he was friends with, where they went to school, blah blah blah…). I included that in the spam, just for good measure. That was the clincher.
I’d be a really good stalker if I weren’t so damned lazy.
July 13th, 2006 at 6:13 pm
[…] Incidentally, some scraper, here, was stealing my content and posting it, verbatim, on his site. I never authorized this. He was even linking back and thusly sending pings to me — I got alerted to each "citation." Not so bright. He also struck ha.ckers.org, a site about software security with some SEO stuff as well. That was even less bright. I was eventually going to report the site (see below), but RSnake at ha.ckers.org had a rather amusing way to deal with the problem. […]
July 14th, 2006 at 10:37 am
Hey - thanks for the great, amusing post. That’s a great way to turn the tables on a full-RSS scraper - I also subscribed to your SEO Egghead feed to keep up to date on other tactics you come up with.
BTW, I noticed you have both Yahoo YPN and Google Adsense ads on the very bottom of your page here. It may have been an oversight on your part when redesigning or otherwise switching up ad networks, but I *think* it violates (at least) Adsense TOS to show their ads with similar ads from another network on the same page…
Just FYI and thanks again for this post,
-Steve
July 14th, 2006 at 4:39 pm
Rsnake - cheers! I’m going to have some fun with this…
July 15th, 2006 at 4:15 am
I’ve no idea why this didn’t occur to me earlier. I just implemented a similar system to a couple frequently scraped sites of mine and I’m now even thinking of setting up some scraping honeypots.
Of course, instead of making the scraper sites look bad, I just put my ads/links on there. This is so fun! (and easy money)
July 15th, 2006 at 8:20 am
Haha, I’m glad to hear this post was more than just amusing. Please let me know how it goes when you get the results back. I’m always curious to know how these things turn out.
July 16th, 2006 at 6:55 am
amusing indeed.. don’t scrape that much
July 16th, 2006 at 9:47 pm
[…] There is a great post over at hackers.org detailing how to stop content thieves republishing your blog. […]
July 17th, 2006 at 5:23 am
this gonna be the “best practice” for any future case of rss-scraping. thanks a lot.
July 17th, 2006 at 6:16 pm
[…] White hats, recently, found another positive use for cloaking, the ability to stop scraping by providing different content to a scraper than to the rest of the world. This has proved detrimental to one splogger and has earned one hacker his fifteen minutes of fame. […]
July 19th, 2006 at 6:04 am
[…] The comics above is inspired by what Ha.ckers.org did to a content thief. It was so funny. I’m just guessing here though — I don’t actually know if RSnake could turn off electrical power at the offending party’s home. That would’ve been cool though. […]
July 19th, 2006 at 7:57 am
[…] Now have a look at this and this. This reminds me of the people who rip off the blog content of capable people such as RSnake over at ha.ckers.org. […]
July 19th, 2006 at 8:33 am
Just in case I find myself in this situation, where was the modification made to your software? Please feel free to email me if you don’t want to post it.
July 19th, 2006 at 8:34 am
[…] Isulong SEOPH just published an amusing comic I thought I’d share with all of you. It’s regarding the content theft that happened last week. It was pretty funny, so I thought you might get a kick out of it. (Click to enlarge) […]
July 19th, 2006 at 8:41 am
Hi, Jonathan, if you check out http://www.plagiarismtoday.com/?p=287 you can get a look at a few different methods of doing this. I gave the piece of code to him, although there are more ways than one to skin this cat; this is just the particular way I chose to do it.
July 19th, 2006 at 10:25 am
That’s definitely a good way of dealing with the scraper, but I think you were too nice! I probably would have been more harsh, making it add content that specifically violated the GoDaddy and/or host’s TOS and then report it. Either way, it makes for an amusing story.
July 19th, 2006 at 12:13 pm
[…] 8 ha.ckers.org article “Sometimes it sucks being a search engine spammer” […]
July 20th, 2006 at 4:02 pm
Considering the guy ripped my whole site, categories included, I think he deserves all he gets. He was not too apologetic when I emailed them about it, did not even bother replying actually.
Only apologetic now as someone with a bit of net savvy caught him out !
Rant Over !!
Alistair
July 21st, 2006 at 4:20 pm
[…] I’m liking this RSnake character more every day. I liked him when I first heard of him on ThreadWatch, and I truly enjoyed his handling of the splogger last week that was republishing his feed on an MFA site. I had been in conversations with IncrediBill previously trying to get him to understand that blocking content thieves is not merely as much fun as feeding them customized content, but I didn’t feel Bill was getting it… too caught up in that “shut ‘em down” behavior. And so now RSnake does a nice job of highlighting the stupidity of picking, of all the feeds in the world, his feed to steal. Well, now I am wondering the same thing about INGDirect. You are an online bank, highly regulated, and considered an enemy by many traditional banks. You have limited offerings, and have to go to great lengths justifying why your offerings might be ok opportunities for those in need of home loans and mortgages. You obviously need affiliates, since you signed on with RegNow. So why in the world would you treat your existing customers… especially the early adopters of online banking, like crap? […]
July 25th, 2006 at 8:20 am
[…] Isulong SEOPH published a follow-on comic having to do with that SEO content thief a few weeks back. This time I apparently am getting him arrested for child pornography (something fairly easy to do, by the way - using XSS via browser caching no less). Also, I should probably comment on the last post as well, yes, I can turn people’s power off. That would have been a terrible idea though, as he already knew who I was. Federal time just isn’t worth it - I’d rather mess with him for a laugh to be honest. Anyway, enjoy: Click here to enlarge […]
July 27th, 2006 at 5:22 pm
Well done!
“He also registered all his domains with Godaddy, and this is totally against their TOS - (copyright infringement). He’s lucky I don’t get all his sites nuked, dumbass. ”
I personally think you should have notified GoDaddy, because he probably will continue ripping content out of other websites and put it on his other domains, just being careful to avoid content from other SEO/spam experts.
Michaël
July 28th, 2006 at 12:30 am
[…] http://ha.ckers.org/blog/20060712/sometimes-it-sucks-being-a-search-engine-spammer/ […]
July 28th, 2006 at 7:28 am
very very nice RSnake! that was an amazing and amusing post. one of the best…
I hate content thieves
July 28th, 2006 at 11:48 am
Can I ask a question? How does somebody rip off your RSS or site? How does the average web owner detect it?
Maybe this is a newbie question, but I have heard of some things like this, and I don’t quite get it yet.
Any links or ideas would be appreciated.
Thanks.
July 28th, 2006 at 1:04 pm
DCackle, unfortunately, there are about 50 different ways to steal web content from a site. The easiest way is just to set up a timed task to pull down the most recent content from the homepage once per day (you could use wget or some homegrown script to do that). That doesn’t work so well if you want to integrate it into a CMS, but there are a few programs out there that do that, like RSStoHtml scripts (a la http://www.feedforall.com/free-php-script.htm). The guy who was doing this was using something like the latter (although I never got confirmation on exactly what the code looked like).
There are dozens of other ways. This particular guy was being extremely obvious, by pulling the information from the same machine he was posting to. There were no less than three signatures that allowed me to see him. Firstly, he did trackbacks which means my CMS was asking me if I wanted to post a link to his site from mine… uh, no! The second was the referring URLs coming from his server. I watch my HTTP logs like a hawk, so this was pretty obvious. The last, was that he was pinging technorati, so by doing a search for myself I saw him all over the search results with my content. Pretty obvious!
For people who have no access to server logs and the code itself you can go the v7n route and insert a keyword into a few posts and search for them to see where they end up. If they ever show up in a post somewhere, you know someone has been stealing your content.
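One way to make that keyword trick repeatable is to derive a unique, nonsense token for each post and search for it later. The sketch below is only one possible approach: the use of an HMAC, the `zx` prefix, the function names, and the post IDs are all assumptions for illustration, not part of the v7n method itself.

```python
import hashlib
import hmac

def canary_token(post_id: str, secret: str, length: int = 12) -> str:
    """Derive a stable, hard-to-guess marker word for one post."""
    digest = hmac.new(secret.encode(), post_id.encode(), hashlib.sha256).hexdigest()
    # The prefix keeps the token from looking like an ordinary hex string.
    return "zx" + digest[:length]

def posts_found_in(page_text: str, tokens: dict) -> list:
    """Return the IDs of posts whose marker appears in the given text."""
    return [post_id for post_id, token in tokens.items() if token in page_text]

# Hypothetical post IDs and secret.
tokens = {pid: canary_token(pid, "keep-this-private")
          for pid in ["20060712-spammer", "20060713-followup"]}
suspect_page = "Some copied article text " + tokens["20060712-spammer"]
print(posts_found_in(suspect_page, tokens))
```

Dropping the token into the body of a post and searching for it in a search engine a few days later shows where the content ended up.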
If you don’t want to watch your logs, you can do something more automated like put absolute URLs to images in your listings. The absolute URL could be to a PHP script that then shows the image in question but logs the referring URL. If the referring URL is anything but your server, it’s time to investigate. I hope that helps a little!
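The referrer check at the heart of that image script might look something like this. It is a Python sketch of the decision only (the idea above is a PHP script that also serves the image); `is_foreign_referrer` and the host set are hypothetical names chosen for illustration.

```python
from urllib.parse import urlparse

# Hosts that are allowed to embed the image (an assumption for this sketch).
OWN_HOSTS = {"ha.ckers.org"}

def is_foreign_referrer(referrer: str, own_hosts=OWN_HOSTS) -> bool:
    """Return True when the image was requested from a page on another host."""
    host = urlparse(referrer).hostname
    if host is None:
        # Missing or unparsable referrer: nothing to conclude either way.
        return False
    return host.lower() not in own_hosts

for ref in ["http://ha.ckers.org/blog/", "http://seojournal.zigzo.com/", ""]:
    print(repr(ref), is_foreign_referrer(ref))
```

Feed readers and privacy tools often send no referrer at all, so an empty value is treated as inconclusive rather than suspicious.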
August 1st, 2006 at 7:59 am
Thanks, RSnake! I appreciate the help.
August 7th, 2006 at 9:33 am
[…] Actualización 2: Creo que es justo mencionar que la idea del cloacking fue via Sigt.net y ha.ckers.org (otro domain hack por cierto). […]
September 11th, 2007 at 7:48 pm
Well it looks like someone removed his site anyway, so that ought to give you some peace of mind. If I ever found out that anyone was doing that with my site, I would find out who they were and I would take them to court and sue them for about a million dollars for destroying a growing enterprise and turning it into worthless spam. Sometimes I wish someone would do that just so I could teach the world a lesson. Don’t you wish that sometimes too? On the other hand, isn’t it said all the time that imitation is the sincerest form of flattery? So you could take it as a compliment that someone thought your site was good enough to copy and take advantage of. I know that at this point, my personal site isn’t quite large enough to have reached its full potential and I can use all the help I can get from people linking, etc. Although I certainly CAN’T use someone pushing me down in the Google rankings and making me look like a spammer. I already don’t have that high of a Google ranking and probably never will…so sad.
October 5th, 2007 at 12:59 am
If I were in your place I’d also feel the same way. Knowing that this moron had his urls registered at GoDaddy, did you go after him at GoDaddy? Did you inform GoDaddy about this person’s monkey business?
Sometimes it’s frustrating to know that some people have so little to do that all they can think of is destroying other people’s lives.
May 19th, 2008 at 12:26 pm
nice story
January 22nd, 2009 at 5:17 am
Hey Rsnake,
It was a great way to deal with him. I am a newbie to this entire thing, and was just looking for ways to stop automated scrapers from scraping data from my site. Is there a way I can prevent the scraping of data from the pages, instead of reacting to the scraping attacks?
I think it would be difficult to stop it, but I want to make it as hard as possible so that at least no newbie can get hold of the data.
Thnks, looking forward to a helpful reply.
rgrds,
Vikas