What Do You Really Know About Canonicals Anyway?
Researching the definition of canonical, you will find the following:
The above screenshot shows an interesting definition because that result is served from OxfordDictionaries.com. Merriam-Webster.com, however, offers the following definition: “conforming to a general rule or acceptable procedure,” which I think is much more relevant.
The rel="canonical" tag is an HTML element that the major search engines publicly announced support for in February 2009. It shows the relationship between URLs and designates the preferred version of a duplicate URL. The canonical tag thus helps to clean up duplication and can be used on duplicate and near-duplicate pages.
The canonical URL is the preferred or dominant URL. The rel="canonical" tag tells search engines which page should be indexed and ranked. Search engines essentially treat it as a soft 301 redirect, passing equity to the preferred URL. If canonical tags are properly implemented, the duplicate, non-canonical URLs should be dropped from the index.
The two duplicate URLs below are duplicates of the preferred URL http://www.example.com:
The duplicate URLs would contain the following canonical tag referencing the preferred URL:
<link rel="canonical" href="http://www.example.com" />
The canonical tag can also be helpful for cleaning up duplication due to tracking or sorting parameters appended to URLs.
For example, if your subcategory page for black jeans can be sorted by price, you can use canonical tags to avoid duplication and extraneous URLs within the index.
http://www.example.com/black-jeans/ would be the preferred URL, but users and search engines can end up on the URL sorted by price:
The above price-sorted URL would have the following canonical tag signaling the preferred ranking URL:
<link rel="canonical" href="http://www.example.com/black-jeans/" />
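As a rough illustration of the idea, the sketch below derives the preferred URL by stripping sorting and tracking parameters and then renders the tag. The parameter names in NON_CANONICAL_PARAMS and the ?sort=price query string are assumptions made for the example, not values from any particular platform, so adjust them to match your own URLs.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that only sort or track and never change the content.
NON_CANONICAL_PARAMS = {"sort", "order", "utm_source", "utm_medium", "utm_campaign"}


def canonical_url(url):
    """Return the URL without sorting/tracking parameters or a fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in NON_CANONICAL_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url):
    """Render the <link> element that points at the preferred URL."""
    return '<link rel="canonical" href="{}" />'.format(escape(canonical_url(url), quote=True))


# Assumed example of a price-sorted URL.
print(canonical_tag("http://www.example.com/black-jeans/?sort=price"))
# prints: <link rel="canonical" href="http://www.example.com/black-jeans/" />
```

Parameters that do change the content, such as a color filter, are kept, so only the sorting and tracking variants collapse onto the preferred URL.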
How to Choose the Preferred Canonical URL
There are two important variables that need to be evaluated before selecting a canonical URL: the equity a URL contains and whether it is linked to internally.
The preferred canonical URL should be the URL that contains the most equity, as it will have more ranking potential. Usually the URL with the most equity is also the URL that is internally linked to through navigation, but this is not always the case. We frequently see client sites that link to non-canonical URLs within navigation (more on this below).
*A good tool for evaluating the equity of a URL is Open Site Explorer. It is free for a few URL evaluations, but after that you’ll need an account.
Let’s use the Sesame Street site as an example. They have duplicated home pages with the following URLs:
We can see the split equity between these two URLs with the screenshot below:
SesameStreet.org contains a canonical tag on http://www.sesamestreet.org/home pointing to http://www.sesamestreet.org (although it is funky coding; call us). The canonical tag signals to search engines that http://www.sesamestreet.org is the preferred version, so they index only that URL and consolidate all link equity onto it.
Internally Linking to Canonical URLs
As briefly mentioned above, we see a lot of sites that do not link to canonical URLs. Internally linking to canonical URLs helps to consolidate link equity and send quality signals to search engines. Take the Sesame Street example above and look at their internal link numbers for the two URLs:
Although http://www.sesamestreet.org is the canonical or preferred version, they are internally linking to http://www.sesamestreet.org/home. The internal signals (internal links) are almost all pointing to the non-canonical version of their homepage. Updating internal links to point to the canonical URL consolidates equity and sends consistent signals, strengthening the URL overall.
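One way to spot this problem is to scan a page's links against a list of known duplicates. The sketch below is a minimal version of that check using only the Python standard library. The CANONICAL_MAP dictionary, the sample navigation markup, and the /videos path are assumptions for illustration; only the two Sesame Street homepage URLs come from the example above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorCollector(HTMLParser):
    """Collect the absolute href of every <a> tag on a page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if tag == "a" and href:
            # Resolve relative links against the URL of the page being audited.
            self.links.append(urljoin(self.page_url, href))


def non_canonical_links(html, page_url, canonical_map):
    """Return (linked URL, preferred URL) pairs for links to known duplicates."""
    collector = AnchorCollector(page_url)
    collector.feed(html)
    return [(link, canonical_map[link]) for link in collector.links if link in canonical_map]


# Assumed mapping of duplicate URL -> preferred URL.
CANONICAL_MAP = {"http://www.sesamestreet.org/home": "http://www.sesamestreet.org"}

# Assumed navigation markup, not copied from the real site.
sample_nav = '<nav><a href="/home">Home</a> <a href="/videos">Videos</a></nav>'

for found, preferred in non_canonical_links(sample_nav, "http://www.sesamestreet.org/", CANONICAL_MAP):
    print(found, "->", preferred)
# prints: http://www.sesamestreet.org/home -> http://www.sesamestreet.org
```

Each pair it reports is an internal link that could be updated to point straight at the preferred URL.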
Cross-domain Canonical Tags
Cross-domain canonical tags are exactly what they sound like: canonical tags that reference a preferred URL on another domain.
Some online publishers, including RKG, will reference content that has been posted elsewhere. The best way to avoid duplication or being penalized for publishing copied content is to implement cross-domain canonical tags.
These tags are no different; they just reference the preferred URL on a different domain. For example, George Michie contributes to SearchEngineLand.com and promotes his articles on the RKG blog. Both sites were indexed and ranking for the same content. We implemented cross-domain canonical tags, and the RKG blog posts were dropped from the indices. That seems like a loss for RKG, and from an organic search perspective it slightly was. However, we didn’t really own the content; SEL did, and theirs should be the preferred, dominant URL for the content.
For example, if you look at the canonical tag found on the following blog post:
you will see it points to the SEL URL:
<link rel="canonical" href="http://searchengineland.com/how-would-you-create-the-perfect-search-engine-104253" />
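To check whether a page carries a cross-domain canonical, you can read the tag out of the HTML and compare its host with the host of the page itself. The following is a small sketch of that check, not a full crawler. The page URL http://blog.example.com/perfect-search-engine/ is a stand-in invented for the example; only the Search Engine Land URL comes from the tag above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel_values = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.href is None:
            self.href = attrs.get("href")


def canonical_of(html, page_url):
    """Return the absolute canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return urljoin(page_url, finder.href) if finder.href else None


def is_cross_domain(html, page_url):
    """True when the canonical tag points at a different host than the page."""
    target = canonical_of(html, page_url)
    return target is not None and urlsplit(target).hostname != urlsplit(page_url).hostname


page_html = (
    '<head><link rel="canonical" '
    'href="http://searchengineland.com/how-would-you-create-the-perfect-search-engine-104253" />'
    "</head>"
)
# Hypothetical URL standing in for the blog post that republishes the article.
print(is_cross_domain(page_html, "http://blog.example.com/perfect-search-engine/"))
# prints: True
```

Comparing hostnames is a deliberately simple test; it treats www.example.com and example.com as different hosts, which may or may not be what you want.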
HTTP Header Implementation
In some cases it is difficult for platforms to adjust the <head> of a page. If implementing rel="canonical" tags within the <head> of a page is not an option for your platform, an alternative is to send the canonical in the HTTP response header, like this:
Link: <http://www.example.com/preferred-version>; rel="canonical"
This is also useful for PDFs and other non-HTML file types that search engines can index, since those files have no <head> to hold the tag.
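How you set the header depends on your server or framework. As one possible sketch, here is a bare WSGI application in Python that serves a PDF response with the canonical Link header attached. The function names and the placeholder PDF body are assumptions for the example, and the preferred URL is the same placeholder used in the header above.

```python
def canonical_link_header(preferred_url):
    """Build the (name, value) pair for a canonical Link response header."""
    return ("Link", '<{}>; rel="canonical"'.format(preferred_url))


def pdf_app(environ, start_response):
    """Minimal WSGI app that returns a PDF along with its canonical header."""
    body = b"%PDF-1.4 placeholder"  # stand-in bytes, not a real document
    headers = [
        ("Content-Type", "application/pdf"),
        ("Content-Length", str(len(body))),
        canonical_link_header("http://www.example.com/preferred-version"),
    ]
    start_response("200 OK", headers)
    return [body]


print(canonical_link_header("http://www.example.com/preferred-version")[1])
# prints: <http://www.example.com/preferred-version>; rel="canonical"
```

The same header can usually be added in web server configuration instead, which avoids touching application code at all.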
Comments and Questions
We would love to hear your comments and questions regarding canonical tags. Please leave them below.