THE RKG BLOG

What Do You Really Know About Canonicals Anyway?

Researching the definition of canonical, you will find the following:

Definition of canonical

The screenshot above shows an interesting definition served from OxfordDictionaries.com. Merriam-Webster.com, however, contains the following definition: “conforming to a general rule or acceptable procedure,” which I think is much more relevant.

The rel="canonical" tag is an HTML element that the major search engines publicly announced support for in February 2009. It shows the relationship between URLs and designates the preferred version of a duplicated URL. The canonical tag thus helps to clean up duplication and can be used on both duplicate and near-duplicate pages.

The canonical URL is the preferred or dominant URL. The rel="canonical" tag allows search engines to understand which page should be indexed and ranked, and it is essentially treated as a soft 301 redirect, passing equity to the preferred URL. Search engines treat the tag as a strong hint rather than a directive, so when canonical tags are properly implemented, duplicate, non-canonical URLs will generally be dropped from the index.

For example, if your homepage has 3 versions due to your CMS structure or any other technical issue, the canonical tag can be placed within the <head> of pages to point to the preferred URL.

Canonical Slim Shady

The two URLs below are duplicates of the preferred URL http://www.example.com:

http://www.example.com/index.html

http://www.example.com/home.html

The duplicate URLs would contain the following canonical tag referencing the preferred URL:

<link rel="canonical" href="http://www.example.com" />
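As a rough sketch of how a page template might emit this tag, the Python below maps the two duplicate homepage paths from the example to the preferred URL. The names `canonical_tag_for`, `PREFERRED_HOME`, and `DUPLICATE_HOME_PATHS` are hypothetical and chosen only for illustration; they do not come from any particular CMS.

```python
from html import escape
from urllib.parse import urlsplit

PREFERRED_HOME = "http://www.example.com"
# Hypothetical set of paths that all serve the same homepage content.
DUPLICATE_HOME_PATHS = {"/index.html", "/home.html"}


def canonical_tag_for(url):
    """Return the canonical <link> tag for a duplicate homepage URL, else None."""
    path = urlsplit(url).path
    if path in DUPLICATE_HOME_PATHS:
        href = escape(PREFERRED_HOME, quote=True)
        return '<link rel="canonical" href="%s" />' % href
    return None


print(canonical_tag_for("http://www.example.com/index.html"))
```

A real site would more likely drive this from its routing or CMS configuration than from a hard-coded set, but the idea is the same: every duplicate resolves to one preferred URL.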

The canonical tag can also be helpful for cleaning up duplication due to tracking or sorting parameters appended to URLs.

For example, if your subcategory page for black jeans can be sorted by price, you can use canonical tags to avoid duplication and extraneous URLs within the index.

http://www.example.com/black-jeans/ would be the preferred URL but users and Search Engines can end up on the URL sorted by price:

http://www.example.com/black-jeans?price50-99

The above price-sorted URL would have the following canonical tag signaling the preferred ranking URL:

<link rel="canonical" href="http://www.example.com/black-jeans/" />
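One way to derive the preferred URL from a parameterized one is to strip the query string. The sketch below does that with Python's standard library. It assumes that no query parameter on the site changes the page content in a meaningful way and that preferred URLs end in a trailing slash, as in the black jeans example; `strip_to_canonical` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_to_canonical(url):
    """Drop the query string and fragment, and normalize to a trailing slash."""
    parts = urlsplit(url)
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    # Empty strings for query and fragment remove them from the result.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


print(strip_to_canonical("http://www.example.com/black-jeans?price50-99"))
```

If some parameters do change the content (pagination, for instance), they would need to be whitelisted rather than stripped wholesale.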

How to choose the preferred canonical URL?

There are two important variables that need to be evaluated before selecting a canonical URL: the equity a URL contains and whether it is linked to internally.

The preferred canonical URL should be the URL that contains the most equity, as it will have more ranking potential. Usually the URL with the most equity is also the URL that is linked to internally through navigation, but this is not always the case. We frequently see client sites that link to non-canonical URLs within navigation (more on this below).

*A good tool for evaluating the equity of a URL is Open Site Explorer. It is free for a few URL evaluations, but after that you’ll need an account.

Let’s use the Sesame Street site as an example. They have duplicated home pages with the following URLs:

http://www.sesamestreet.org

http://www.sesamestreet.org/home

We can see the equity split between these two URLs in the screenshot below:

Sesame Street split equity

SesameStreet.org contains a canonical tag on http://www.sesamestreet.org/home pointing to http://www.sesamestreet.org (although the coding is funky; call us :) ). The canonical tag signals to search engines that http://www.sesamestreet.org is the preferred version, so they index only that URL and consolidate all link equity onto it.

Internally Linking to Canonical URLs

As briefly mentioned above, we see a lot of sites that do not link to canonical URLs. Internally linking to canonical URLs helps to consolidate link equity and send quality signals to search engines. Take the Sesame Street example above and look at their internal link numbers for the two URLs:

Sesame Street internal linking

Although http://www.sesamestreet.org is the canonical or preferred version, they are internally linking to http://www.sesamestreet.org/home. The internal signals (internal links) are almost all pointing to the non-canonical version of their homepage. Updating internal linking to point to the canonical URL consolidates equity and sends consistent signals, strengthening the URL overall.
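A quick audit for this kind of problem can be sketched in a few lines of Python. The code below collects the links in a page and reports any that point to a known non-canonical URL. The `LinkCollector` class, the `non_canonical_links` function, the `canonical_map` dictionary, and the sample markup are all made up for illustration; only the two homepage URLs come from the example above.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def non_canonical_links(html, canonical_map):
    """Return (linked_url, preferred_url) pairs for links to non-canonical URLs."""
    collector = LinkCollector()
    collector.feed(html)
    return [(h, canonical_map[h]) for h in collector.hrefs if h in canonical_map]


# Hypothetical mapping of non-canonical URL -> canonical URL.
canonical_map = {"http://www.sesamestreet.org/home": "http://www.sesamestreet.org"}
# Illustrative markup, not the site's actual navigation.
sample = (
    '<nav><a href="http://www.sesamestreet.org/home">Home</a> '
    '<a href="http://www.sesamestreet.org">Logo</a></nav>'
)
for linked, preferred in non_canonical_links(sample, canonical_map):
    print("Update link %s -> %s" % (linked, preferred))
```

This only matches absolute URLs exactly as written; a fuller audit would resolve relative links and crawl every template.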

Cross-domain Canonical Tags

Cross-domain canonical tags are exactly what they sound like: they declare a relationship between URLs on different domains.

Some online publishers, including RKG, will reference content that has been posted elsewhere. The best way to avoid duplication or being penalized for publishing copied content is to implement cross-domain canonical tags.

These tags are no different; they just reference the preferred URL on a different domain. For example, George Michie contributes to SearchEngineLand.com and promotes his articles on the RKG blog. Both sites were indexed and ranking for the same content. We implemented cross-domain canonical tags, and the RKG blog posts were dropped from the indices. That seems like a loss for RKG, and from an organic search perspective it slightly was. However, we didn’t really own the content; SEL did, and theirs should be the preferred, dominant URL for the content.

For example, if you look at the canonical tag found on the following blog post:

http://www.rimmkaufman.com/blog/the-perfect-search-engine/19122011/

you will see it points to the SEL URL:

<link rel="canonical" href="http://searchengineland.com/how-would-you-create-the-perfect-search-engine-104253" />
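To check whether a page's canonical tag points off-domain, something like the following Python sketch could be used. It parses the page's markup, finds the first rel="canonical" link, and compares hostnames. `CanonicalFinder` and `is_cross_domain` are hypothetical names, and the sketch assumes the page HTML has already been fetched.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> encountered."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.canonical is None:
            self.canonical = attrs.get("href")


def is_cross_domain(page_url, html):
    """True if the page declares a canonical URL on a different host."""
    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.canonical:
        return False
    # Resolve relative hrefs against the page URL before comparing hosts.
    target = urljoin(page_url, finder.canonical)
    return urlsplit(target).netloc.lower() != urlsplit(page_url).netloc.lower()


page = "http://www.rimmkaufman.com/blog/the-perfect-search-engine/19122011/"
head = (
    '<head><link rel="canonical" href="http://searchengineland.com/'
    'how-would-you-create-the-perfect-search-engine-104253" /></head>'
)
print(is_cross_domain(page, head))
```

Note that this treats www and non-www hosts as different domains, which may or may not be what you want for a given audit.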

HTTP Header Implementation

In some cases it is difficult for platforms to adjust the <head> of a page. If implementing rel=canonical tags within the <head> of a page is not an option for your platform, then an alternative solution is to implement the canonical tag within the HTTP header like this:

Link: <http://www.example.com/preferred-version>; rel="canonical"

This approach also works well for .pdf files and other non-HTML file types that search engines can index, since those files have no <head> in which to place a tag.
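How you set the header depends on your server or framework. As one hedged illustration, here is a minimal Python WSGI sketch that serves a file with the canonical Link header attached. The function names `canonical_link_header` and `pdf_app` and the placeholder body are assumptions made for this example; the header format itself is the one shown above.

```python
def canonical_link_header(preferred_url):
    """Build the (name, value) pair for a rel="canonical" Link header."""
    return ("Link", '<%s>; rel="canonical"' % preferred_url)


def pdf_app(environ, start_response):
    """Minimal WSGI app that attaches a canonical header to a PDF response."""
    # Placeholder bytes standing in for real file contents.
    body = b"%PDF-1.4 placeholder"
    headers = [
        ("Content-Type", "application/pdf"),
        ("Content-Length", str(len(body))),
        canonical_link_header("http://www.example.com/preferred-version"),
    ]
    start_response("200 OK", headers)
    return [body]


print(canonical_link_header("http://www.example.com/preferred-version")[1])
```

On Apache or nginx the same header would normally be added in the server configuration instead of application code.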

Comments and Questions

We would love to hear your comments and questions regarding canonical tags. Please leave them below.

Comments
6 Responses to “What Do You Really Know About Canonicals Anyway?”
  1. B. Moore says:

    Can you also use rel="canonical" to solve a problem with a site leaking session IDs into Google SERPs?

  2. Cara Pettersen says:

    Yes, you certainly can :)

  3. G says:

    Hi Cara

    Thank you for a great post… and the answer to what I really know about Canonicals is ‘far less than I’d like to’!

    So, if you’ll excuse the newbie/non-programmer question – how when you have the following selection of urls for the same page:

    http://www.blahblah.co.uk/
    http://www.blahblah.co.uk/index.htm
    http://blahblah.co.uk/
    http://blahblah.co.uk/index.htm

    but only one you can edit:
    index.htm

    What’s the best way to start solving the problem?

  4. Cara Pettersen says:

    Hi G, great question!

    My final recommendation would depend on metrics around these URLs, such as looking at which URL contains the most equity/backlinks pointing to it, but just looking at the above duplicate URLs, this is what I would suggest:

    Since /index.htm is the only URL you can edit, I would stick with that one as your preferred URL and either 301 redirect the root (/) version to /index.htm or implement a canonical tag on the root version pointing to the /index.htm version. Additionally, make sure all internal linking points to the /index.htm version.

    I would also choose the www version as the preferred version, unless the non-www version is stronger. Either way, choose one version and be consistent with it throughout the site. I would then 301 redirect all non-www versions to the www versions.

    Hope this helps! And please don’t hesitate to ask if you have any further questions.

  5. WebsiteMBA says:

    I’m using a custom-built PHP framework site and have been thinking about how to implement canonical URLs on it. I’m glad I found this post; I will use your HTTP Header Implementation. Thanks for the help Cara!

  6. Cara Pettersen says:

    Thanks for your comment @WebsiteMBA! Happy to see the article helped you out. Just a little FYI: You can send other directives within the HTTP header as well, such as the robots noindex,follow directive via the X-Robots-Tag header :)