Custom robots.txt betters blogspot blog design

Yesterday I came across SEO-Hacker’s nice article outlining the advantages and dangers of relying on robots.txt for controlling the crawlers.

That set me off looking for ways to optimize this blog's template design - after all, who isn’t in need of more readers?  And if a simple Blogspot template redesign can attract more visitors, why not?
:-)

There are 2 simple, basic rules.  These restrictions keep the bots from tagging the website content as duplicate and penalizing it.
  • Do not allow crawlers like Googlebot to index the search results.
  • Do not allow crawlers to index the archive pages.
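
As a rough sketch of the first rule, Python's standard `urllib.robotparser` can check what a robots.txt would block. The robots.txt text and the `example.blogspot.com` address below are assumptions for illustration, not this blog's actual file. Blogger serves search results and label pages under `/search`, so a single Disallow line covers both.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, for illustration only (not this blog's real file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
BASE = "http://example.blogspot.com"  # placeholder address

for path in ("/search?q=seo", "/search/label/SEO", "/2013/03/some-post.html"):
    allowed = parser.can_fetch("Googlebot", BASE + path)
    print(path, "->", "crawlable" if allowed else "blocked")
```

Note that robots.txt only stops crawling; keeping an already known page out of the index is the job of a `noindex` directive.
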
After rooting around, it soon dawned on me that Blogger had, some time back, introduced the custom robots.txt option in the Blogger settings.  It was time to explore, while hopefully not skewering the blog.

Sure enough, going into Settings -> Search Preferences and clicking Edit on the last option, Custom robots header tags, opened up the options.

To be careful and not mess up the crawlers too badly, I decided to do exactly what I came in for: prevent crawlers like Googlebot from indexing the ‘Archive and Search’ pages.  And the Blogger settings have exactly that option, alongside the ones for ‘HomePage’ and ‘Defaults for Posts and Pages’.
  1. Check ‘all’ in both the home and posts/pages sections.  That is, allow the crawlers to index and follow everything on these pages.
  2. Check *only* the ‘noindex’ checkbox in the archive & search pages section.
  3. Check the ‘noodp’ checkbox in all 3 sections.
Switching on the ‘noodp’ option was an afterthought on reading this 2006 post by Matt Cutts (yeah!) on the NOODP meta tag.

[Screenshot: Blogger custom robots.txt settings]
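
The three steps above can be summarised as a small lookup table. This is only a sketch: the dictionary keys and function names are made up for illustration, and the meta tag form is an assumption about how such directives look as a standard robots meta tag, not a claim about what Blogger actually emits.

```python
# Directives ticked in the settings, per page type (key names are hypothetical).
SETTINGS = {
    "homepage": ["all", "noodp"],
    "archive_and_search": ["noindex", "noodp"],
    "posts_and_pages": ["all", "noodp"],
}


def robots_content(page_type: str) -> str:
    """Join the ticked directives into a robots 'content' value."""
    return ", ".join(SETTINGS[page_type])


def robots_meta_tag(page_type: str) -> str:
    """Render the directives as a standard robots meta tag (illustrative only)."""
    return f'<meta name="robots" content="{robots_content(page_type)}">'


for page_type in SETTINGS:
    print(page_type, "->", robots_meta_tag(page_type))
```
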

It might be a good idea to switch on ‘nofollow’ tags as default under all 3 sections, but my understanding still isn’t clear, and I would welcome any suggestions.

From what I understand, the ‘nofollow’ tag prevents the crawlers from following any links in my website.  If I don’t use nofollow, then whatever PageRank my blog has will be distributed among those links.

OTOH, I do want other ‘important’ pages of my blog to accumulate PR.  For example, this blog’s homepage PR is 3.  But AFAIK, none of the individual post pages have any PR at all.

Another confusing issue is the Google Authorship tags.  If I use ‘nofollow’ tags as default for all pages, will the Google Authorship tagging be affected?  Google Authorship only works when the content is ‘linked’ to my Google+ profile (and vice-versa).
:-P

Update:  As of today, the Googlebot last crawled this blog on Mar 16, well before (just on?) Panda update #25 and before this tweak.  Let me see what this tweak does with the Googlebot.

BTW, looking through the blog design template, I found a meta tag inserted some time back to take care of the noindex issue on ‘archive’ pages.  Note it says ‘follow’ and not ‘nofollow’.  Deleted the meta tag now, of course!

[Screenshot: robots meta tag in the Blogger template]
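
To spot a leftover tag like this without reading the whole template, a short script can pull the robots directives out of a page's HTML. A minimal sketch using Python's standard `html.parser`; the sample markup is an assumed stand-in for the tag described above, not the exact line from this blog's template.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Split "noindex,follow" into clean, lower-cased directives.
            self.directives += [
                d.strip().lower() for d in content.split(",") if d.strip()
            ]


def find_robots_directives(html: str) -> list:
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


# Assumed sample markup, for illustration only.
SAMPLE = (
    "<html><head>"
    "<meta content='noindex,follow' name='robots'/>"
    "</head><body></body></html>"
)

directives = find_robots_directives(SAMPLE)
print(directives)
print("nofollow present:", "nofollow" in directives)
```
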

2 comments:

  1. Tejas Rathod, 17/06/2013, 11:27

    your method is good but i think my method is easy to implement check out here http://chillofyblogging.blogspot.in/2013/06/how-to-setup-custom-robotstxt-in.html

  2. hi @rathod. You have a nice blog and a nice point. Unfortunately, it won't work for custom domain hosted on blogger. Here is my post on that :
    http://www.madmadrasi.net/2013/06/sitemap-xml-in-blogger-for-better-seo.html

