What is a canonical tag?
A canonical tag (or rel=canonical) is a small piece of HTML code that helps search engines determine the “main” version of a page among other pages that are identical or very similar to it.
In SEO, canonical tags are used to let Google know which version of the page you want to appear in search results, to consolidate link equity from the duplicate pages as well as to improve crawling and indexing of your website.
Here’s what a canonical tag can look like on the webpage:
<link rel="canonical" href="https://mangools.com/blog/robots-txt/" />
Why are canonical tags important in SEO?
The primary purpose of the canonical tag is to tell search engines which page is the main, original version and which are just duplicates that look the same.
Most websites contain at least some pages that are considered duplicates: they display the same content under different URLs.
In these instances, Google has to decide which page to choose for indexing and ranking purposes – it won’t use all the pages as search results since they all look identical or just very similar.
For example, a product page is usually reachable through more than one URL. Besides the main URL, it can also be displayed with various URL parameters (e.g. for sorting, currency, sizes, etc.):
https://www.randomshop.com/clothes/shirts.html
https://www.randomshop.com/clothes/shirts.html?Size=XL
https://www.randomshop.com/clothes/shirts.html?Size=XL&color=red
In this example, the product page can be displayed under the main /clothes/ category, but it can also be filtered and displayed with size and color parameters. It can therefore appear as a search result under 3 different URLs.
This is where canonical tags become important: they indicate to Google that you want to index the main URL under /clothes/ (the one without parameters), use it as a search result, and ignore the rest of the URLs.
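To make the shirts example more concrete, here is a minimal Python sketch of how a shop could derive the canonical URL for every filtered variant. It assumes that the parameters only filter or sort the same content, so the whole query string can be dropped. The function names `canonical_for` and `canonical_tag` are made up for this illustration and do not come from any SEO tool.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_for(url: str) -> str:
    """Return the URL without its query string and fragment.

    Assumption: every parameter only filters or sorts the same content.
    """
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_tag(url: str) -> str:
    """Build the <link> element that a duplicate page would carry."""
    href = escape(canonical_for(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'


variants = [
    "https://www.randomshop.com/clothes/shirts.html",
    "https://www.randomshop.com/clothes/shirts.html?Size=XL",
    "https://www.randomshop.com/clothes/shirts.html?Size=XL&color=red",
]
for variant in variants:
    print(canonical_tag(variant))
```

All three variants end up with the same tag, pointing to the URL without parameters. If some of your parameters lead to genuinely different content, they should not be stripped this way.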
Note: Keep in mind that Google treats the canonical tag as a signal, not as a directive.
If there are valid reasons to choose another page for indexing and ranking purposes rather than the canonical one, the search engine might ignore the canonical tag altogether.
Or as Martin Splitt stated:
“All right, let’s start with the idea that it is a directive because it’s not.”
Besides the fundamental purpose of the canonical tag, there are also some important SEO benefits that come with it.
1. They consolidate PageRank
Canonical tags help to consolidate link equity (PageRank) from all duplicate pages into the one main, canonical page.
Duplicate pages can often obtain backlinks from various external sources – whether they are backlinks from random websites, users on social media, etc.
These pages therefore partially take over the link equity from the main version of the page – the one that you actually want to rank as a search result.
By implementing canonical tags on the duplicate pages, you can consolidate PageRank into a single URL, which can improve its overall ranking in Google Search.
2. They help manage syndicated content
Canonical tags can tell the search engine which website contains the original version of the content and which websites just republish (i.e. syndicate) it.
Many website owners use other websites for publishing their content (either for promotional or other purposes).
In this case, Google has to decide which website is the original source of this content and should be displayed as a search result and which websites just promote it.
Setting up canonical tags on these external websites helps to resolve this problem and promote the original, main version of the page in Google Search.
Or as Danny Sullivan stated:
“If people deliberately chose to syndicate their content, it makes it difficult to identify the originating source. That's why we recommend the use of canonical or blocking. The publishers syndicating can require this.”
— Danny Sullivan (@dannysullivan) September 18, 2019
3. They improve crawling
Canonical tags help search engines like Google efficiently crawl the pages that you actually want crawled and indexed, as opposed to duplicates that do not need to be crawled as often.
Duplicate pages waste Google’s resources and time as they are not important for crawling or indexing purposes.
When you designate canonical pages, Google can focus more on the pages that matter the most and therefore save “crawl budget”.
Or as Google officially stated:
“The canonical page will be crawled most regularly; duplicates are crawled less frequently in order to reduce Google crawling load on your site.”
How to add a canonical tag?
Adding canonical tags to your pages is pretty easy: just go to any duplicate webpage and add a <link> tag with rel="canonical" to the <head> section of the page.
The link in the canonical tag should point to the main, original version.
Implementing canonical tags is best done on a page-by-page basis. However, this can consume a lot of time and resources, or even be impossible on larger websites.
Fortunately, canonical tags can also be implemented automatically by using various plugins such as Yoast SEO (for WordPress).
Implementation of canonical tags via this plugin is pretty straightforward:
- Choose the page for canonicalization
- Head over to the “Advanced” section of the page
- Add the canonical URL that you wish to refer to
There are also a few other ways to indicate your canonical pages to Google.
Use HTTP header
Canonical tags can also be added in the HTTP header of the webpage.
This is especially useful for non-HTML documents such as PDFs, since they don’t contain any <head> section where you could add a standard canonical tag.
To implement a canonical tag in the HTTP header, you need to edit your server configuration (e.g. the .htaccess file of your site on an Apache server) so that the response for the document includes a header that can look like this:
Link: <https://www.yoursite.com/random-document.pdf>; rel="canonical"
If you would like to learn more about adding canonical tags via HTTP header, check out this article about the implementation of canonicals.
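If you want to double-check what your server actually sends, the header value can be built and read back with a few lines of Python. This is only a rough sketch: the helper names are invented for this example, and the pattern only recognizes the simple form shown above, where rel="canonical" directly follows the URL. It is not a complete parser for Link headers.

```python
import re
from typing import Optional

# Matches <URL>; rel="canonical" (quotes around the rel value are optional).
_CANONICAL_LINK = re.compile(
    r'<([^>]+)>\s*;\s*rel\s*=\s*"?canonical"?', re.IGNORECASE
)


def canonical_link_header(url: str) -> str:
    """Build the value of a Link header that declares a canonical URL."""
    return f'<{url}>; rel="canonical"'


def canonical_from_link_header(value: str) -> Optional[str]:
    """Return the canonical URL found in a Link header value, if any."""
    match = _CANONICAL_LINK.search(value)
    return match.group(1) if match else None


header_value = canonical_link_header("https://www.yoursite.com/random-document.pdf")
print("Link:", header_value)
print(canonical_from_link_header(header_value))
```

The first function produces the same value as the header example above, and the second one pulls the canonical URL back out of it, which can be handy when you inspect the response headers of a PDF.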
Tip: There are also a few other ways to tell the search engine which pages you want to be the canonical versions:
- Sitemap – Google can automatically assume that all the URLs listed in the Sitemap are the main, canonical versions
- Redirect – duplicate pages can transfer traffic as well as all page signals into the single, canonical URL via 301 redirects
- Internal linking – Google can more easily determine which pages are canonicals if internal links within your site point to them from duplicate pages
- HTTPS – search engines like Google usually prefer HTTPS pages with a valid SSL certificate as canonicals (as opposed to unencrypted HTTP pages)
Canonical tag best practices
1. Use self-referencing canonicals
Although it is not mandatory, it is always a good practice to add a canonical tag on a page that points to itself – even if you did not use canonical tags on the rest of the duplicate pages.
rel=canonical on the main, original pages gives search engines like Google a clear signal that they are canonical versions:
“I recommend doing this kind of self-referential rel=canonical because it really makes it clear for us which page you want to have indexed or what this URL should be when it’s indexed.” (John Mueller).
2. Use absolute URLs
Absolute URLs in canonical tags can help you avoid unintentional mistakes or misinterpretation of canonical URLs by a search engine (as opposed to relative URLs).
Absolute URLs should also include www and trailing slashes (if your site uses them).
Here is an example of an absolute URL in a canonical tag:
<link rel="canonical" href="https://www.randomwebsite.com/randompage/" />
And here is an example of a relative URL:
<link rel="canonical" href="/randompage/" />
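To see why relative URLs are riskier, here is a small Python sketch of how a relative href gets resolved against the address of the page it sits on. It uses the standard urljoin function. The staging host name and the helper names are hypothetical and serve only as an illustration.

```python
from urllib.parse import urljoin, urlsplit


def is_absolute(href: str) -> bool:
    """True if the href carries both a scheme and a host."""
    parts = urlsplit(href)
    return bool(parts.scheme and parts.netloc)


def resolve_canonical(page_url: str, href: str) -> str:
    """Resolve a canonical href against the URL of the page that contains it."""
    return urljoin(page_url, href)


# The same relative href served from a (hypothetical) staging host
# resolves to the staging host, not to the live site.
print(resolve_canonical("https://staging.randomwebsite.com/blog/post/", "/randompage/"))

# A missing leading slash changes the resolved path.
print(resolve_canonical("https://www.randomwebsite.com/blog/post/", "randompage/"))
```

The relative href always follows the host and path of the page it was served from, so a copy of the page on another host or protocol would end up declaring a different canonical URL. An absolute URL does not have this problem.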
3. Use lowercase URLs
Search engines like Google can treat uppercase and lowercase letters in URLs as different.
Using lowercase in canonical URLs can therefore help you stay consistent and avoid duplication issues in the eyes of search engines.
As a good practice, try to use lowercase URLs on your servers and apply the same lowercase URLs in the canonical tags.
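As a quick illustration, the Python sketch below lowercases the host and path of a URL and groups URLs that differ only by letter case. The helper names are invented for this example. The query string is deliberately left untouched, on the assumption that parameter values may be case-sensitive on your site.

```python
from collections import defaultdict
from typing import Dict, Iterable, List
from urllib.parse import urlsplit, urlunsplit


def lowercase_url(url: str) -> str:
    """Lowercase scheme, host and path; keep query and fragment as they are."""
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path.lower(),
         parts.query, parts.fragment)
    )


def case_variants(urls: Iterable[str]) -> Dict[str, List[str]]:
    """Group URLs that become identical once lowercased."""
    groups = defaultdict(set)
    for url in urls:
        groups[lowercase_url(url)].add(url)
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


crawled = [
    "https://www.randomwebsite.com/RandomPage/",
    "https://www.randomwebsite.com/randompage/",
    "https://www.randomwebsite.com/other/",
]
print(case_variants(crawled))
```

Every group in the result is a set of URLs that could be seen as duplicates only because of letter case, so these are good candidates for a single lowercase canonical URL.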
4. Canonicalize cross-domain duplicates
Canonical tags can also reference your main pages from other domains – not just from your website.
If you have duplicate content present on pages on a different website (e.g. repurposed post on some news site), you should:
- use the self-referencing canonical tag on your original page
- apply the canonical tag on the external page, referencing your original one
What to avoid with canonical tags?
1. Multiple canonicals on 1 page
Watch out for multiple canonical tags that might end up in the HTML of a page by accident.
Although rare, having more than 1 canonical tag on a page can create confusion for the search engine and result in the canonical signal being ignored.
Or as Google officially stated:
“In cases of multiple declarations of rel=canonical, Google will likely ignore all the rel=canonical hints. Any benefit that a legitimate rel=canonical might have offered will be lost.”
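A simple way to catch this by accident-prone setups (e.g. a theme and a plugin both adding a tag) is to count the canonical tags in the HTML of a page. The sketch below uses Python's built-in HTML parser. The class and function names are made up for this example, and the sketch only looks at the raw HTML, not at tags added later by JavaScript.

```python
from html.parser import HTMLParser
from typing import List


class CanonicalCollector(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def find_canonicals(html: str) -> List[str]:
    """Return all canonical URLs declared in the given HTML."""
    collector = CanonicalCollector()
    collector.feed(html)
    collector.close()
    return collector.hrefs


page = """
<head>
  <link rel="canonical" href="https://www.randomwebsite.com/randompage/" />
  <link rel="canonical" href="https://www.randomwebsite.com/other-page/" />
</head>
"""
found = find_canonicals(page)
if len(found) > 1:
    print("Multiple canonical tags found:", found)
```

If the list contains more than one URL, the page declares multiple canonicals and is worth fixing.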
2. Avoid canonicals on non-duplicates
Always make sure that the content on the duplicate pages and the main version of the page is either identical or at least very similar when applying canonical tags.
Canonical tags on pages that are completely different might confuse search engines or be ignored completely.
Or as Martin Splitt explained:
“… if the content is completely different or different enough for the algorithms to decide that this is not a duplication, then the canonical is pointless.”
3. Canonicals on paginated pages
Paginated pages contain fragmented content across several different pages (e.g. comment section on the website divided into pages “1”, “2”, “3”).
In this instance, you should always use self-referencing canonical tags on every individual page – and not refer to page “1” from the rest of the paginated pages:
“The main thing to avoid, since this post is about canonicalization, is to use the rel=canonical on page 2 pointing to page 1. Page 2 isn’t equivalent to page 1, so the rel=canonical like that would be incorrect.” (John Mueller)
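In a template, this simply means that every paginated page builds the canonical tag from its own URL. Here is a minimal Python sketch of that idea. The `?page=` parameter and the function names are assumptions made for this example, so adjust them to the way your site builds pagination URLs.

```python
from html import escape


def paginated_url(base_url: str, page: int) -> str:
    """URL of one page in a paginated series (page 1 is the bare base URL)."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return base_url if page == 1 else f"{base_url}?page={page}"


def self_canonical_tag(base_url: str, page: int) -> str:
    """Canonical tag that points to the page itself, never back to page 1."""
    href = escape(paginated_url(base_url, page), quote=True)
    return f'<link rel="canonical" href="{href}" />'


for page_number in (1, 2, 3):
    print(self_canonical_tag("https://www.randomwebsite.com/comments/", page_number))
```

Each page gets its own URL as the canonical one, so page 2 and page 3 are not declared to be duplicates of page 1.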
4. Don’t block canonicals via robots.txt
You should never block URLs that carry canonical tags with the robots.txt file.
Robots.txt will prevent Google from crawling the duplicate pages, so it will be unable to see the canonical tag referencing the main version of the page.
Furthermore, blocking URLs that contain canonical tags will also prevent PageRank from being transferred to your main versions.
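You can test for this conflict with Python's built-in robots.txt parser. The sketch below is only an approximation: the standard library parser does simple prefix matching and may not interpret every rule exactly like Googlebot does, and the /print/ path, the robots.txt content and the function name are made up for this example.

```python
from typing import Iterable, List
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, urls: Iterable[str],
                 user_agent: str = "Googlebot") -> List[str]:
    """Return the URLs that the given robots.txt disallows for the user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


robots_txt = """\
User-agent: *
Disallow: /print/
"""

# Hypothetical duplicate pages that carry a canonical tag.
pages_with_canonicals = [
    "https://www.randomshop.com/clothes/shirts.html",
    "https://www.randomshop.com/print/shirts.html",
]
print(blocked_urls(robots_txt, pages_with_canonicals))
```

Any URL in the result carries a canonical tag that crawlers are not allowed to fetch, so the tag on that page cannot be seen.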
5. Don’t use canonical in the <body>
Canonical tags should always be placed in the <head> section of your pages, not anywhere else in the HTML document.
Google will simply ignore canonical tags in the <body> section or in any other place.
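Here is a rough Python sketch that reports where the canonical tags of a page are placed. It relies on explicit <head> and <body> tags in the markup and does not reproduce the way browsers or Google repair broken HTML, so treat it as a first check only. The class and function names are invented for this example.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class CanonicalPlacement(HTMLParser):
    """Sorts canonical tags by whether they appear inside <head> or not."""

    def __init__(self) -> None:
        super().__init__()
        self.in_head = False
        self.inside_head: List[str] = []
        self.outside_head: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "body":
            self.in_head = False
        elif tag == "link":
            attributes = dict(attrs)
            rel_tokens = (attributes.get("rel") or "").lower().split()
            if "canonical" in rel_tokens and attributes.get("href"):
                target = self.inside_head if self.in_head else self.outside_head
                target.append(attributes["href"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def canonical_placement(html: str) -> Tuple[List[str], List[str]]:
    """Return (canonicals inside <head>, canonicals anywhere else)."""
    parser = CanonicalPlacement()
    parser.feed(html)
    parser.close()
    return parser.inside_head, parser.outside_head


page = (
    '<html><head><title>Shirts</title></head><body>'
    '<link rel="canonical" href="https://www.randomwebsite.com/randompage/" />'
    '</body></html>'
)
inside, outside = canonical_placement(page)
print("In <head>:", inside)
print("Misplaced:", outside)
```

Any URL reported as misplaced comes from a tag that sits outside the <head> section and should be moved there.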
6. Avoid canonical loops and chains
You should always point canonical tags directly at the main page in order to avoid canonical chains and loops (similar to redirect chains and loops).
For example, using a canonical tag from page A to page B and then from page B to page C will create a canonical chain that can confuse search engines and waste their resources and time. If page B pointed back to page A instead, the result would be a canonical loop.
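Chains and loops are easy to detect once you have a list of pages and the canonical URL each of them declares (for instance from a crawl). The Python sketch below follows the canonical references from one page and reports what it finds. The function name and the A, B, C page labels are only an illustration.

```python
from typing import Dict, List, Tuple


def follow_canonicals(start: str, canonical_of: Dict[str, str]) -> Tuple[List[str], str]:
    """Follow canonical references from `start`.

    `canonical_of` maps a page to the URL in its canonical tag.
    Returns the visited path and one of "ok", "chain" or "loop".
    """
    path = [start]
    seen = {start}
    current = start
    while True:
        target = canonical_of.get(current)
        if target is None or target == current:
            break  # no tag or a self-referencing tag: this is the final page
        path.append(target)
        if target in seen:
            return path, "loop"
        seen.add(target)
        current = target
    # One hop (duplicate -> main page) is fine; more hops form a chain.
    return path, "ok" if len(path) <= 2 else "chain"


print(follow_canonicals("A", {"A": "B", "B": "C", "C": "C"}))
print(follow_canonicals("A", {"A": "B", "B": "A"}))
print(follow_canonicals("A", {"A": "B", "B": "B"}))
```

The first call reports a chain (A to B to C), the second a loop (A to B and back to A), and the third the desired setup, where the duplicate points directly to a page with a self-referencing canonical.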