Thursday, March 8, 2012

rel=canonical bullshit in Blogger

Ask anyone who has attempted a Google Blogger to WordPress migration what their biggest headache is, and the answer will probably be "canonical bullshit." It's unfortunate, because rel="canonical" would otherwise be an acceptable solution to the problem of duplicate content. Although Blogger does support canonical links, the way it implements them makes it hard to have a single clean canonical link pointing to the new WordPress content.

I'm not sure this was always the case. This blog post explains how to add canonical links pointing to a new site, but when I tried it I discovered that Blogger leaves an extra canonical tag in the header. If you look at the template in Blogger, you won't see any reference to "canonical", but if you check the source of the published page it turns up, and it points to the URL of that Blogger page itself.

Why does Google do this? I don't know. But I do know how the canonical tag gets added. It comes from this line in the template:

<b:include data='blog' name='all-head-content'/>

The result for me: after manually adding another bit of code to insert a canonical link pointing to a new WordPress blog, I found that the resulting blogspot page now has two canonical links, one pointing to my new WordPress blog page and one pointing to the original blog page on Blogger. This surely is a big no-no to Google, and might have a detrimental effect on ranking if the algorithm thinks I am trying to confuse it for illicit ends.
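One way to confirm whether a published page really carries two canonical links is to scan its source for them. The following is a minimal sketch using only Python's standard library. The helper name find_canonicals and the sample markup (including the example.blogspot.com and example.com URLs) are made up for illustration and are not taken from any real template.

```python
from html.parser import HTMLParser


class CanonicalCollector(HTMLParser):
    """Collect the href of every <link rel="canonical"> found in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, so split before comparing.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonicals.append(attr_map.get("href"))


def find_canonicals(html_text):
    """Return the canonical hrefs in document order (hypothetical helper)."""
    collector = CanonicalCollector()
    collector.feed(html_text)
    return collector.canonicals


# Made-up page source imitating the situation described above:
# one canonical pointing at the blogspot page, one at the new site.
SAMPLE_HEAD = """
<head>
<link rel='canonical' href='http://example.blogspot.com/2012/03/post.html'/>
<link rel="canonical" href="http://example.com/post/"/>
</head>
"""

if __name__ == "__main__":
    found = find_canonicals(SAMPLE_HEAD)
    print(len(found), "canonical link(s):", found)
```

To check a real post, paste the page source (View Source in the browser) into a string and pass it to find_canonicals. More than one entry in the result means the page has the duplicate-canonical problem.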

I am tempted to delete that line so the page has a single canonical link pointing to just the WordPress page, but doing so would unfortunately strip out other necessary code, including the charset declaration, the favicon.ico link, a bunch of RSS code, and several scripts that help with rendering in IE and other functions.

For what it's worth, I have written a book about using Google Blogger for small businesses. There are lots of tips on how to use Google Blogger to build a business website, but I don't delve into the code too much (aside from a little CSS).

[Image: Blogger settings page, rel=canonical]

2 comments:

  1. I had the same problem and was very annoyed too! I found http://www.growingwiththeweb.com/2013/12/how-to-remove-default-blogger-assets.html which suggested an idea to me: I put in style tags around the b:include as below, and then I copied back in the lines from the head-content that I wanted, but changed the href associated with rel="canonical".
    Code sample below, but not sure it'll display (or look at the article I reference, except I needed 2 closing style tags instead of 1.)
    <style type="text/css"> /* <style>
    <b:include data='blog' name='all-head-content'/>
    </style>*/ </style>

    (Format it properly, of course; as mentioned, I was having trouble displaying this code snippet.)

  2. I should add that someone who spends a lot of time writing about the inner workings of Blogger thinks that a technique like this could encourage Google to suspend the blog as spam. (See http://blogging.nitecruzr.net/2013/03/some-blogger-blogs-being-locked-as.html which is a nice list of things not to do to your blogspot blog unless you want to look like you belong to a spam farm. Monkeying with blogspot's own rel=canonical tag is third on the list.) By that logic, one is just stuck with one's blogspot and needs to make the best of it (here's another post by the same author about how "Blogger blogs cannot be used as gateways": http://blogging.nitecruzr.net/2013/04/blogger-blogs-cannot-be-used-as-gateways.html ).

    I did take that code sample down off my own blog since I'm not 100% certain it was working.

