Professional Search Engine Optimization With PHP Book
I guess I get to add another book to my X-mas list for myself. Jaime Sirovich wrote me about a book he has been authoring for quite a while now. I knew he was writing it, but apparently they have been moving ahead at a good clip and are already ready for pre-orders.
The book is Professional Search Engine Optimization with PHP and is designed to teach the fundamentals of SEO. Jaime thought I would be interested in the chapter on Blackhat SEO, since that applies to some basic web application security concepts, including HTML injection (as opposed to JavaScript injection, which traditionally isn’t read by search engine spiders since no one searches on JavaScript). Pretty interesting, with some good code snippets. If you’re interested in SEO, I’d add it to your list of upcoming book purchases.



November 16th, 2006 at 2:39 am
RSnake, I heard many times that spiders don’t eat JavaScript, but on many client sites I found otherwise. I had some menus built strictly in JavaScript, and still in the logs (I’m always in the logs, btw) I found that Google did eat the JavaScript links, because the sites were indexed. It is also a big myth that a spider can’t index pages like product.php?req=foo; only if the var is named id, as in product.php?id=123, won’t Google index it. A mapped URL structure like /home/products/foo isn’t indexed any better than a single PHP script like product.php?req=foo. Really, I found that the latter is indexed better! This is purely my experience from constantly analysing the logs. Any ideas on this one?
November 16th, 2006 at 5:56 am
Hrm, do you have an example of a page that was indexed via JavaScript? I.e., using link:thatpage, it should find the original page that had the link embedded in JavaScript.
Otherwise, it might just be from someone, somewhere, linking to that hidden page. I also wouldn’t be surprised if they pulled pages from Google Toolbar logs. Haven’t tested though..
November 16th, 2006 at 9:12 am
Two comments. First, I didn’t mean to say it didn’t follow the JavaScript; I meant to say it didn’t index it (meaning you can’t search on it). That’s a half truth now that Google has introduced its code search, but you get the point. For normal searching it’s useless to have there.
My second comment is that you have to be careful you aren’t just looking at the user agent. Lots of people surf as GoogleBot. Unless you are 100% sure the request is coming from Google’s domain, I would bet it’s a user spoofing their user agent.
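For what it's worth, one way to check whether a "GoogleBot" hit in the logs really comes from Google's domain is a reverse DNS lookup on the IP followed by a forward lookup to confirm it. Here is a rough Python sketch of that idea, not a definitive implementation. The function name, the injectable lookup parameters, and the list of hostname suffixes are my own assumptions for illustration; check the suffixes against whatever Google currently publishes before relying on them.

```python
import socket

# Assumed hostname suffixes for Google's crawlers; verify before relying on them.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True only if `ip` reverse-resolves to a Google hostname
    and that hostname resolves back to the same IP.

    The two lookup parameters are hypothetical hooks so the check can be
    exercised without network access; by default they use the socket module.
    """
    if reverse_lookup is None:
        # gethostbyaddr returns (hostname, aliases, addresses)
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliases, addresses); IPv4 only
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        host = reverse_lookup(ip).rstrip(".").lower()
    except OSError:
        # No PTR record or lookup failure: treat as unverified.
        return False

    # The leading dot in each suffix rejects names like "fakegooglebot.com".
    if not host.endswith(GOOGLE_SUFFIXES):
        return False

    try:
        # Forward-confirm: anyone can set a PTR record on their own IP,
        # so the hostname must resolve back to the original address.
        return ip in forward_lookup(host)
    except OSError:
        return False
```

You would call is_verified_googlebot() with the remote address from a log line whose user agent claims to be GoogleBot. A False result means the user agent string alone should not be trusted for that hit.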