THE RKGBLOG

How rel=canonical is Breaking Sites

It’s been several months since the link canonical tag was announced, and it’s being used fairly liberally out in the wild. It’s patently clear to us that this tag is quite powerful and effective, and the consequences of its misuse very serious. It’s being misused a lot (not a surprise). We’re seeing a ton of sites with poor rel=canonical implementation. The end result: it’s causing havoc.

Yes, rel=canonical is breaking websites. I’ll share a few anecdotes in this post that show just how bad it can be.

Why Link Canonical is Dangerous

Part of the problem with rel=canonical is that it’s extremely easy to implement. Just throw a link element into the head of a page and you’re good. That very ease belies the power of the tag. Google announced to us at SMX East last October that 2 out of 3 times the link canonical target influences the organic decision. That’s right: 2/3 of the time your rel=canonical target is affecting the crawl and indexation of the page, and in turn, the ranking of the page.

You can see how easy it is to mess this up. Now, when clients tell me they’ve “already got plans for the link canonical tag” I get all weird and anxious on the phone… it scares the heck out of me. At least with hard redirects you can visually see the change occur. With the link canonical in place, you don’t see anything, unless you look at the source. It’s like nothing happened at all.
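Since the tag never shows up in the rendered page, checking it means reading the source. Here is a minimal sketch of that check in Python, using only the standard library. The names CanonicalFinder and find_canonicals are hypothetical, and this is one way to do it rather than the way any particular tool works.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        # html.parser routes both <link ...> and <link ... /> through here.
        if tag != "link":
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html_source):
    """Return all canonical hrefs declared in html_source (hypothetical helper)."""
    finder = CanonicalFinder()
    finder.feed(html_source)
    return finder.canonicals
```

An empty list means the page declares no canonical. More than one entry is worth a second look, because the page is then sending conflicting signals.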

You’re Doing it Wrong

At SES Chicago this month on the Duplicate Content panel, I shared the story of how a client of ours with about 100,000 SKUs on one of their sites had somehow put a link canonical target of the home page on every single page. Every product page, every category, every section, literally every page pointed back with rel=canonical to the root domain. We didn’t realize this had happened until 2 or 3 months after the fact, because we weren’t working on that particular site at the time. You can imagine what the traffic profile looked like.
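A sitewide bug like this one shows up quickly if you tally where the canonicals point. Here is a rough sketch, assuming you already have a mapping from each crawled page URL to the canonical it declares. The function name and the idea of a "share" threshold are illustrative assumptions, not an established metric.

```python
from collections import Counter


def dominant_canonical_target(canonical_by_page):
    """Return (target, share) for the most common non-self canonical target.

    canonical_by_page maps each crawled page URL to the canonical URL
    declared on that page. Self-references are skipped because they do
    not fold one page into another.
    """
    targets = Counter(
        target
        for page, target in canonical_by_page.items()
        if target != page
    )
    if not targets:
        return None, 0.0
    target, count = targets.most_common(1)[0]
    # Share is measured against all crawled pages, not just the non-self ones.
    return target, count / len(canonical_by_page)
```

If the returned target is the home page and the share is anywhere near 1.0, you are probably looking at the scenario described above.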

Susan Esparza calls it like it is

Bruce Clay told me recently that “we’re recommending that the link canonical be implemented only with professional help.” Amen to that.

Other failures we’ve seen with implementations of link canonical tags:

  • The link canonical points to itself. This is fine when there are no other options during implementation, but not advised for sitewide usage because it can introduce unexpected behaviour. Be careful with this one.
  • A link canonical chain is created, with canonical targets pointing to multiple URLs, and back and forth, becoming a web of confusion. For example: the http://www.mydomain.com link canonical points to http://www.mydomain.com/index.html and that one points back to the canonical version. Choose a canonical version when duplication exists. Stick with it – consistency is key.
  • Link canonicals on deep pages (such as product pages) point to the category or parent URL.
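The back-and-forth case in the list above can be caught mechanically by following each canonical until it settles or repeats. Here is a small sketch, again assuming a mapping of page URL to declared canonical. The canonical_chain name and the max_hops cutoff are assumptions made for illustration.

```python
def canonical_chain(url, canonical_by_page, max_hops=10):
    """Return (path, looped): the URLs visited and whether a loop was hit.

    A page with no entry in canonical_by_page, or one that declares
    itself, is treated as the end of the chain.
    """
    path = [url]
    while len(path) <= max_hops:
        current = path[-1]
        target = canonical_by_page.get(current, current)
        if target == current:
            return path, False  # settled on a single canonical version
        if target in path:
            return path, True  # pointed back at a URL already visited
        path.append(target)
    # Gave up after max_hops; the chain is too long to be healthy either way.
    return path, False
```

A healthy duplicate has a path of length two: the duplicate, then the canonical version. Anything longer, or any loop, deserves a look.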

Now that Google is supporting cross-domain usage of rel=canonical, which is fabulous news for advanced SEOs, I imagine it’s going to be even worse out there. Please people, be careful using this tag, and get professional help. Incorrect use of rel=canonical has serious implications.

  • Adam Audette
    Adam Audette is the Chief Knowledge Officer of RKG.
  • Comments
    51 Responses to “How rel=canonical is Breaking Sites”
    1. @digeratti says:

      Good post Adam. Why do you think they would canonical tag to the homepage? Error in automation or just plain simple misunderstanding of the tag?

    2. jaamit says:

      A very wise warning Adam – I had a similar problem on a client’s site which I’ve been meaning to blog about for a while but haven’t gotten round to it! :)

      We implemented canonicals on product pages across the site to pick up some duplicate content issues (pagination, sort by parameters, etc). But we failed to realise that we’d put the wrong URL format within the canonical tags so that they all pointed to non-existent URLs! o_O

      Within a couple of weeks all the product URLs dropped out of Google and interestingly Webmaster Tools reported hundreds of 404 errors, treating the incorrect canonicals as broken links. We fixed the tags quickstyle and rankings came back, together with the additional boost we were originally expecting!

      I honestly wasn’t expecting an error like this to have such drastic consequences – after all it’s supposed to be a “hint” right? Well it showed how powerful the canonical tag is, and as you point out how it can get you in trouble.

    3. Adam Audette says:

      @digeratti I believe the error crept in because the link canonical tags were being dynamically generated. We only found out about the bug after traffic had fallen off to product pages.

      @jaamit great learnings, thanks for sharing. It’s easy to make mistakes such as those, way too easy. Part of the problem is the link canonical tag doesn’t really “do” anything to the page – it still renders exactly the same way, same URL, etc… so you have to monitor for consequences to see what’s going on. It’s like a back door to trouble!

    4. Dr. Pete says:

      I’ve heard two very similar horror stories just in the last month. In both cases, it involved a CMS snafu where someone unintentionally canonicalized 1000s of URLs to the home-page. The effects on their index were quick and disastrous.

      Fortunately, in at least one case, the recovery was pretty quick, post-disaster, but it’s yet another example of how any tool can be dangerous in the wrong hands. Major architectural changes shouldn’t be made because someone read one SEO blog post.

    5. Adam Audette says:

      @Dr Pete – thanks for commenting, totally agree. It sounds like this is a very common issue. Will be interesting to read other horror stories as they begin to surface.

    6. I think this article is pretty useless. It’s like saying HTML is breaking websites because someone coded the site wrong. It’s not the canonical tag breaking websites, it’s dopey developers and SEO’s.

    7. the problem is most CMS systems need them but they are not configured for them.. which means.. bad code.

    8. Adam Audette says:

      @Jared From Subway thanks for trolling by, rel=canonical offers unprecedented control of crawling and indexation for such trivial implementation. For SEOs in the know… that’s a big deal. As far as I know, the search engines haven’t released a supported HTML tag that can break websites.

    9. S Douglas says:

      Hi,

      Your article intrigued me. But it also scared me.

      Why? Cuz I’m a lame-O and I don’t know what “rel=canonical” is… and why would a link named this be dangerously exciting?

      I was going to ask Mintz-T for an explanation, but I know he’s pretty fed up with answering my dumb*ss questions. So you’re it!

      cheers!

    10. Adam Audette says:

      @S Douglas that’s easy, here:

      Learn about the Link Canonical Element in 5 Minutes:
      http://www.mattcutts.com/blog/canonical-link-tag/

    11. Adam, I notice in your source code that you’re using the All In One SEO Pack plug-in (as do I, and I’m sure many of your readers), but not for rel=canonical. I tried ticking that check box in the control panel and found that it set every page’s canonical URL to be its actual URL. That makes sense, I guess, since while a given blog post may be found on its own page, the home page, and the archives for that month, the category to which it’s assigned, and whatever tags are on it, that doesn’t necessarily make any of those pages duplicates or even near-duplicates of each other.

      However, if every page is the canonical version of itself, that means that no page serves as the canonical version of another page. So doesn’t that make the presence of the tag completely superfluous, like putting a meta robots tag of “index,follow” on a page?

      I may be missing something, but I can’t find a reason to use that particular function of the plug-in. Is there a setting I haven’t noticed that will actually set one URL as the canonical version of a different URL?

    12. Adam Audette says:

      @Bob – yes, good point. I haven’t used the canonical option in All in One SEO. It sounds though like if permalink/post pages are pointing to themselves, there shouldn’t be a problem. As you say, it’s unnecessary and superfluous.

      As a general statement, when rel=canonical is configured for larger sites it’s easier to be ‘baked into’ each page by default. So we get lots of canonical targets pointing to themselves. This doesn’t worry me so much, by itself, but may introduce unexpected behaviour down the road.

      If at all possible, I recommend only using rel=canonical when a specific requirement warrants it. Not always possible, but preferred and keeps more control in general.

      Anyone else have thoughts to share on the canonical feature of All in One SEO Pack?

    13. Mark Knowles says:

      S Douglas,

      Those of us who know you, know that you are no dumb*ss. I think it goes like this…
      1) Google has a good idea to solve a problem
      2) New solution has unintended outcomes
      3) Be careful

      ;-)

      You’re wanting more huh?

    14. S Douglas says:

      @Hi Mark

      I’m sure that Mintzy will tell you I know jacksquat about SEO and programming, coding, designing, etc. I just look at stuff to see if it works, will it make things easier to make a profit, does it hurt, and is it legal. Then I hire somebody else to do the expertise once all of those criteria are first met.

      Thanks for the “no dumb@ss” props, tho! lol

    15. Adam Sherk says:

      The caution on not having the link canonical point to itself creates a challenge for news sites in particular. A lot of publishers now use the tag to try to offset duplicate content issues caused by appending tracking codes to URLs (e.g. site.com/article-name?xid=rss-top-stories).

      Tech teams generally say they cannot apply the tag to only the coded URLs; from their perspective the articles are in fact only rendering on one URL. (They also say they can’t use 301 redirects or it interferes with their tracking capabilities). So they place the rel=canonical tag on the “clean” canonical URL so that no matter how many different tracking codes may be appended to it (separately), the correct canonical URL is present in the tag. But that means nearly every piece of canonical content on a news site ends up having a rel=canonical pointing to its own URL.
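The approach described in this comment amounts to computing the "clean" URL from whatever URL was requested and printing that in the tag. Here is a minimal sketch with Python's urllib.parse. The xid parameter comes from the example above, while the TRACKING_PARAMS set and the clean_canonical name are assumptions a real site would replace with its own.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking parameters; only "xid" appears in the example above.
TRACKING_PARAMS = {"xid"}


def clean_canonical(url, tracking_params=TRACKING_PARAMS):
    """Return url with tracking parameters and any fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in tracking_params
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )
```

Requested without a tracking code, the page ends up with a canonical pointing to its own URL. That is exactly the self-referencing situation described here.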

    16. Adam Sherk says:

      I didn’t mean for that example URL to come through as an actual link. site.com/article-name?xid=rss-top-stories

    17. Brian Carter says:

      The 100,000 SKU bad implementation made me laugh out loud.

      At Fuel, we concluded pretty quickly after this came out that since a lot of our sites are dynamic and have a site-wide header file, that it wouldn’t easily solve any problems for us.

      Thanks for the article!
      :-)

    18. Adam Audette says:

      @Adam Sherk – thanks for the comment, it’s valuable hearing your experiences on this topic w/ the big publishers you guys work with.

      Sorry about the URL – fixed it in your first comment.

    19. Adam Audette says:

      @Brian Carter – funny, yes, but so sad! Someone mentioned to me the other day that Stephan Spencer had found a case where Google was using rel=canonical wrong somewhere. Anyone know?

      I shared at SMX Advanced last June how Google’s dupes of DMOZ on directory.google.com and http://www.google.com were causing issues in some apparel-related SERPs. A week or so after that presentation, the problem I showed (for “clothing”) was fixed. I don’t think it was fixed w/ rel=canonical, though, and at the time they still didn’t support cross-domain usage. (Not only did Google display dupes of its own directory, but also the exact same content on DMOZ.)

    20. Jack says:

      Hello Adam,

      Re:

      The link canonical points to itself. This is fine when there are no other options during implementation, but not advised for sitewide usage because it can introduce unexpected behaviour. Be careful with this one.

      Google have said that should a page use a canonical tag to point to itself, this is fine and doesn’t do any harm – obviously your point contradicts this, but does it do so based on an actual occurrence (anecdotal or not) where this proved true? Or are your concerns more for the potential of the tag to cause problems in the future?

      Thanks,
      Jack

    21. Adam Audette says:

      @Jack – correct, a link canonical pointing to itself is technically fine, and officially supported. It can introduce problems. We’ve seen situations where duplicate content all have their link canonicals pointing to themselves, rather than to a single canonical version. But yes, the concern is really about the potential for this breaking things in unforeseen ways.

    22. gavin says:

      Adam — I had to respond to #8 up there

      Not exactly a “search engine supported” tag, but one that breaks websites all the time?… \…or should I say…breaks their Search Agent Compatibility?

      Anyway, I couldn’t agree more that some things (like the canonical tag) should only be handled by a trained professional (of course, I also believe that WYSIWYG is the worst thing to ever happen to the internet) — but the (in)visibility (or lack thereof) of implementation (or an implementation fail) points to the odd circumstance that internet-delivered content is still evaluated by all but maybe a handful of SEOs (and people using screen readers) based on purely visual criteria.

      Long story short – it’s too bad it’s forever amateur hour on the internet, and that every epic fail doesn’t automatically add p {text-decoration: blink;} to an offending website’s stylesheet…

    23. Alysson says:

      The ease with which this tag can be implemented by amateurs has been among many concerns about rel=canonical from the beginning. I hope people take your advice to heart and realize that not using it at all is a far better option than using it improperly.

      I always advise those who bring up the subject of the rel=canonical tag that if they don’t understand how to use 301 redirects properly, they shouldn’t even consider implementing the rel=canonical tag. It isn’t something to be taken lightly. It must be approached with a very specific strategy and a full understanding of why it’s being done, what it does and what using it will accomplish.

      It’s not the “quick fix” solution for correcting the problems site owners have avoided fixing previously – either because it was too time consuming or troublesome to do it right in the first place. It’s putting a band-aid on a gunshot wound and I’m not surprised so many mistakes are being made implementing it.

      By the way, I agree with Bob – the redundancy and superfluousness of including rel=canonical on all pages only to incorporate the actual page URL is apparent. I questioned the purpose of it when it was incorporated into the All-in-One-SEO and Platinum SEO plugins, and again when the standalone Canonical URLs plugin was introduced.

      If any of them allowed you to specify the canonical URL for the page, rather than simply adding the tag with the actual URL of the page, I might be able see a use for it. Without that functionality, there’s no constructive purpose that I can see to using the rel=canonical tags generated automatically by any of the plugins.

      There’s my two cents…okay, more like 25 cents. ;)

    24. embarrassed anbd broke webmaster says:

      I’m embarrassed to say I did the exact same thing. A coding error pointed the canonical to the home page for every single one of my 20,000 plus pages. Traffic from Google dropped by 80% from 10-12k visitors to 2-3k. :( :( :(

      Now the canonical is pointing to itself. Traffic has not returned so far — this has been 2 months to date.

    25. @petryshen says:

      The same can be said about any tag/code that does not affect how the page is rendered. We have a client in the classified space that inadvertently migrated 1.5 million plus pages from staging to the live environment with a Meta Robots NOINDEX tag intact (they added it to the staging environment as an extra precaution). You ever wonder how long it takes to drop a half million pages from the index?

      In the end it comes down to having proper procedures in place at the ground level. The developers and engineers need to be educated, supported and provided with the tools to ensure these simple, but catastrophic errors do not happen.

      We put together a SEO course specifically for engineers by engineers that has helped tremendously.

    26. Adam Audette says:

      Good points @petryshen and a scary tale you tell about the noindex oversight. Education always helps.

    27. embarrassed anbd broke webmaster says:

      @petryshen — how long ago was that and do you recall how long it took to recover?

      (My guess is that within 2 weeks the pages started disappearing from the index and that they were gone within 7 more days)

    28. @petryshen says:

      Correct Adam. The error was spotted 3 weeks after implementation during some random tests. Thankfully, we were able to reverse the slide quickly after cleanup (we also made some minor content changes and resubmitted the XML sitemaps). Within a week we had gained back over 70% of the pages. Within two weeks, the pages and traffic returned to pre-error levels.

    29. blogstalk says:

      What is the best way to handle canonical links on a blog site that shows recent posts on the home page and duplicates that content on pages dedicated to individual blog posts? For http://blogstalk.com, I am using only canonical links on the individual post pages, but I don’t have any on the home page. I refrain from putting them there, because I have only seen one canonical link per page on the examples that I have encountered. Is it possible to put multiple canonical links to specify that the individual post pages are the ones that should be canonicalized?

    30. I am also confused by this tag. The All in One SEO plugin in WordPress automatically creates a canonical URL for every post URL. I wonder if this creates a problem?

    31. Arnaud says:

      Hello,
      I’ve got a big problem with canonical link,

      on a product page, I’ve got a “sort by” tool, with an url parameter, so I’ve put a canonical link to point to the url without any ‘sortby’ parameter,

      In GWT, I’ve got 17000 duplicate titles errors , with the 2 pages: with and without the ‘sortby’ parameter !
      Example :
      http://www.meilleurmobile.com/forfaits/priceByOperator.do?mobileId=1915&operateur=Simplicime&sortby=packProductDrillDown
      and
      http://www.meilleurmobile.com/forfaits/priceByOperator.do?mobileId=1915&operateur=Simplicime

      containing the tag

      Have I done anything wrong , or is it a major GWT bug ?

      Thanks for help !

      - Arnaud

    32. Adam Audette says:

      @Arnaud it may just be that Google hasn’t yet crawled and indexed the sort by pages.

    33. Adam Audette says:

      @Arnaud – oh I see – you’re speaking of GWT data. GWT may report on dupe title tags regardless of the appearance of rel=canonical on those pages, but I’m not sure.

    34. @Arnaud it sounds like a glitch. I’ve found that as long as you’re not feeding the URLs in question via the XML sitemap you shouldn’t see any duplicates in the HTML Suggestions of GWT.

      While I have seen other discrepancies in GWT, I’ve found that Google has been pretty good at recognizing URLs with the rel=canonical tag and deals with them appropriately.

    35. Arnaud says:

      Hi,
      thanks for answering,
      hm. very strange then, that GWT shows these discrepancies.
      Very annoying too, because I’m completely blind to real errors.

      I’ve been waiting a few weeks to see if this ‘glitch’ would disappear, but with no success.

      Would it be a good idea to add a “noindex” meta tag on lists with an additional “sortBy” parameter?

      - Arnaud

    36. Arnaud says:

      Other example :

      http://www.meilleurmobile.com/mobiles/mobile_card.jsp?workMobileId=1768&withoutPrice=1

      and
      http://www.meilleurmobile.com//mobiles/mobile_card.jsp?workMobileId=1768

      2 variants of the same page (with and without price display), so I really think the ‘canonical’ is meaningful here,

      but seen as a “duplicate title” in GWT ..

      - Arnaud

    37. Hi Arnaud,

      To start, I would change the path of your rel=canonical implementation from Relative path to Absolute to eliminate this as error. Google says both are allowed but I find it best to keep Google on a short leash and prevent any options for errors here.

      I’d also ensure you stick to a single slash after the domain name so that you don’t create additional problems for yourself. For example all of the following pages resolve to the same page:

      http://www.meilleurmobile.com/mobiles/mobile_card.jsp?workMobileId=1768

      http://www.meilleurmobile.com//mobiles/mobile_card.jsp?workMobileId=1768

      http://www.meilleurmobile.com///mobiles/mobile_card.jsp?workMobileId=1768

      and so on….

      Please let me know how it goes. You can leave a message here or DM me on Twitter at @petryshen
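Both suggestions in this comment, absolute paths and a single slash after the domain, can be applied wherever the tag is generated. Here is a hedged sketch using Python's urllib.parse. The absolute_canonical name is hypothetical, and collapsing repeated slashes assumes the server treats those paths as the same page, as the examples above suggest.

```python
import re
from urllib.parse import urljoin, urlsplit, urlunsplit


def absolute_canonical(page_url, href):
    """Resolve href against page_url and collapse repeated slashes in the path.

    Note: a relative href that itself begins with "//" is read by URL
    rules as a host name, not a path, so it should be fixed at the source.
    """
    absolute = urljoin(page_url, href)
    parts = urlsplit(absolute)
    path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )
```

With this in place, the single, double and triple slash variants listed above all produce the same canonical target.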

    38. Phil says:

      I have also mucked up my canonical tags, and suffered a big drop in G traffic. I had pointed my canonicals to non existent pages.

      I fixed the problem about 2 months ago and still no recovery.

      Does anyone have any experience as to how long it might take to come back.

      Should I set up 301 redirects for all those pages that don’t exist? pointing them back at the correct page?

    39. Osny Santos says:

      I made the same mistake.
      I put canonical tag at all pages pointing to home page.
      We’ve about 40.000 pages indexed at Google.
      4 days after this wrong change, Google was indexing about 1.500 pages.
      Oct. 12, I corrected the mistake but still with 1.500 pages at Google SERP.
      I put “faster” at Crawl rate in Google Webmaster Tools
      I am waiting. Any suggestions?

    40. Trevor Tessier says:

      With automated canonical plugins for WordPress and Joomla, do you have any knowledge about whether or not there are ones to avoid or ones that do the job right?

    41. #38 – If the drop in G was entirely due to the canonical tag, you should have seen some recovery within a few weeks. Though I must admit using a canonical to a non existing page is not one I’ve heard until now.

      Yes, setup a 301 from the non existent pages to the correct ones and then resubmit your XML to Google Webmaster Tools.

      #39 – Delete your link in Webmaster Tools to the current XML sitemap. Leave it down for a few days and then re add it. I’d also recommend you chop your XML sitemap into smaller recognizable chunks (this will need to be automated) so that you can better determine what is and is not getting indexed.

      If you don’t see the number climb in Webmaster Tools, there are likely other issues at play.

      Cheers… tom @petryshen
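One way to automate the "smaller recognizable chunks" idea is to split the URL list by site section and write one sitemap per section. The sketch below groups by the first path segment, which is an assumption about what "recognizable" means for a given site, and sitemaps_by_section is a made-up name. The urlset namespace is the one defined by the sitemaps.org protocol.

```python
from collections import defaultdict
from urllib.parse import urlsplit
from xml.sax.saxutils import escape


def sitemaps_by_section(urls):
    """Group URLs by first path segment and render one sitemap per group."""
    groups = defaultdict(list)
    for url in urls:
        segments = [s for s in urlsplit(url).path.split("/") if s]
        # Assumed rule: pages directly under the root share a "root" file.
        section = segments[0] if len(segments) > 1 else "root"
        groups[section].append(url)

    sitemaps = {}
    for section, members in groups.items():
        entries = "".join(
            f"  <url><loc>{escape(u)}</loc></url>\n" for u in members
        )
        sitemaps[section] = (
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
            + entries
            + "</urlset>\n"
        )
    return sitemaps
```

Submitting each section separately makes it easier to see which part of the site is not coming back into the index.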

    42. embarrassed anbd broke webmaster says:

      It took exactly 6 months to the day for my indexing issues with Google to resolve. I’d say the “penalty” for the wrong canonical tag works EXACTLY the same way that using the URL removal tool work(ed). It’s out of the index for 6 months before it’s re-crawled.

      But that’s just a guess based on my only experience.

      Traffic returned over the course of a few days just as it had disappeared, but only to about 80% of the pre-error level. Over the next 6 weeks the other 20% returned at a fairly consistent rate of about 4 or 5% per week….

      80% loss in traffic was horrible — but it could have been worse. I had made just a handful of my tens of thousands of dynamic pages into static pages for performance reasons, so they weren’t affected. They were the most popular pages and generated substantial revenue and they carried me through with “just” a 50% loss in revenue.

      Good luck to all others who have been bitten by this issue, or who have bitten themselves. However you prefer to look at it.

    43. Eric says:

      Hi everyone, I’ve been scouring the web for a tool that will alert me if a canonical tag is added to my pages or if a canonical tag is being used incorrectly. Do you know of any such tools?

    44. Adam Audette says:

      @Eric – you would need a crawler for this. My company has crawling software that will allow for this, but it requires analysis because there isn’t an easy way for a machine to know when a canonical tag is ‘wrong’.

      You might check out IIS SEO Toolkit, it does show excellent canonical information by URL, although I’m not sure on the specifics of its meta tag handling.
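A machine can't judge whether a canonical target is the right one, but it can flag the mechanical failures reported in this thread, such as targets that don't exist. Here is a sketch over crawl output you would need to collect yourself. The crawl data shape and the audit_canonicals name are assumptions, not the behaviour of any tool mentioned here.

```python
def audit_canonicals(crawl):
    """Flag canonicals whose targets look broken.

    crawl maps URL -> (http_status, canonical_or_None), an assumed shape
    for whatever your crawler exports.
    """
    findings = []
    for url, (_status, canonical) in sorted(crawl.items()):
        if canonical is None or canonical == url:
            continue  # nothing declared, or a harmless self-reference
        if canonical not in crawl:
            findings.append((url, "target not crawled"))
        elif crawl[canonical][0] != 200:
            findings.append((url, f"target returns {crawl[canonical][0]}"))
        elif crawl[canonical][1] not in (None, canonical):
            findings.append((url, "target declares a different canonical"))
    return findings
```

Everything it reports still needs a human to decide what the correct target should be.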

    45. J says:

      Just a quick question – why don’t you have canonical tag in your blog… just wondering?

    46. Adam Audette says:

      @J – we actually don’t care much about the SEO on our own site… too busy helping clients! It’s the old case of the “shoe cobbler’s children” if you know what I mean…

    47. Thomas says:

      Been there done that! I wondered why, after a website redesign for a car dealer, the website was not getting indexed.

      Turns out the canonical was bad!

      Another time with a large million page website, the canonical was set to point to the home page on every single inner page.

      IT person did not understand canonical. Result was a huge drop that took months to recover from!

    48. Worried Guest says:

      Hi, this mistake was done by SEO and designers from an outsource company to our site, so all the products were not read by Google. How much damage during the past 4 months do you think it has done as our sales are down 70% since having this canonical work done and it was me that spotted it to the SEOs

    49. Karen says:

      My developer is not very SEO savvy, but I had him incorporate the rel=canonical on my site to show http://www.mysite.com as the canonical. What I did notice is the format is slightly different than what Google states. Example: Google states to add /> at the end of canonical. My developer is missing the / . Is this a problem and should I have him fix it? This is the current format on my site:
      …” />

    50. Karen says:

      My developer created a canonical that is missing the / at the end of the canonical, is this okay?
      Example… is what currently is on the site

    51. Karen says:

      Canonical on my site ends with …com” >… is this okay? Notice: the / is missing after the “