Google was not the first search engine to crawl the Internet, but it did things better and returned more useful results. It has not stopped expanding as a company since.

SEO applies to every search engine that exists today, but Google handles most of the search traffic on the Internet. Whenever we talk about SEO, people automatically assume that we are talking about optimizing a website for Google.

When it comes to SEO, we need to check many factors, both onsite and offsite. But if your onsite SEO is not up to the mark, then no matter how well you do your offsite SEO, you will not get the results you are expecting.

URL Canonicalization

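The term Canonicalization can be tough to understand. In short, it means choosing one preferred URL for content that can be reached through several URLs. Let’s say there are two URLs of a website: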

  • http://xyz.org
  • http://www.xyz.org

Both of these URLs show the same content, and neither redirects to the other. This can result in duplicate content issues on Google, and you can face penalties.

Let’s see one more example. There are two URLs on a website that resolve to the same page.

  • http://xyz.org
  • http://xyz.org/index.php

If both of these URLs show the same page, this may cause an issue as well!

You may not pay much attention to this issue, but it may result in serious duplicate content penalties. The problem is that search engine bots can’t decide which version of the URL to add to their index. If two pages serve the same content, the bots will assume one is a copy of the other, and your website can get penalized.

If your site opens on two URLs showing the same content, you must fix it. Configure your server so that whether a user opens the site with www or without www, they always end up on one version, typically through a 301 (permanent) redirect. In this way, you can fix the canonicalization.
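
To make the www versus non-www rule concrete, here is a minimal sketch in Python of the decision a server-side redirect makes. It assumes the non-www host is the preferred one. The names PREFERRED_HOST, ALTERNATE_HOSTS and redirect_target are made up for this example; on a real site you would normally express the same rule in your web server's own configuration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for this sketch: the non-www host is the preferred version.
PREFERRED_HOST = "xyz.org"
ALTERNATE_HOSTS = {"www.xyz.org"}


def redirect_target(url):
    """Return the URL to 301-redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in ALTERNATE_HOSTS:
        # Keep scheme, path, query and fragment; swap only the host.
        # Note: this simple version drops any explicit port or credentials.
        return urlunsplit(parts._replace(netloc=PREFERRED_HOST))
    return None


print(redirect_target("http://www.xyz.org/about?x=1"))
print(redirect_target("http://xyz.org/about"))
```

The first call returns the non-www address to redirect to, and the second returns None because the request is already on the preferred host.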

At times, though, you may want to share the same content on two URLs. In that case, you can use the rel="canonical" tag to let search engines know which URL is the original and which one is a copy of it. This can save you from duplicate content penalties.

How to apply URL Canonicalization?

Let us now check how to apply URL Canonicalization. We don’t need to type in lines of code to do it. A simple rel="canonical" tag is enough to apply Canonicalization.

For example, say there are two URLs on the website that resolve to the same content. These two URLs are:

  • http://xyz.org
  • http://xyz.org/index.php

HTML Canonicalization

The second URL results in the same content as the first URL. Both display the same page, so in this case you can apply the rel="canonical" tag to indicate that the URL with index.php is the canonical URL for the first one.

This is how it is applied.

<link rel="canonical" href="http://thewebpage.org/index.php">

HTTP Header Canonicalization

The above markup works for HTML content, but what if we are dealing with non-HTML content such as a PDF document? In that case, we can use HTTP Header Canonicalization, where the server sends the canonical URL in a Link response header:

> HTTP/1.1 200 OK

> Content-Type: application/pdf

> Link: <http://www.example.com/white-paper.html>; rel="canonical"

> Content-Length: 785710
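
If your pages and file responses are generated by code, both forms can be produced from the same URL. The following is only a sketch: the helper names canonical_link_tag and canonical_link_header are hypothetical, and how you attach the header pair to a response depends on your framework.

```python
from html import escape


def canonical_link_tag(url):
    """HTML form: goes inside the page's <head>."""
    # Straight double quotes, with the URL escaped for use in an attribute.
    return '<link rel="canonical" href="{}">'.format(escape(url, quote=True))


def canonical_link_header(url):
    """HTTP form: a (name, value) pair for non-HTML responses such as PDFs."""
    return ("Link", '<{}>; rel="canonical"'.format(url))


print(canonical_link_tag("http://thewebpage.org/index.php"))
print(canonical_link_header("http://www.example.com/white-paper.html"))
```

Keeping both in one place makes it harder for the HTML tag and the HTTP header to drift apart and point at different URLs.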

When should Canonicalization be used?

Now that you know what exactly Canonicalization means, we can move forward and see when you should use it, because there are many more cases than the two I have mentioned in the examples above.

Here are a few duplicate content situations that proper URL Canonicalization can resolve.

  • Different URLs for the same content
  • Various categories and tags that result in the same content
  • A mobile website displaying the same content on a different URL or subdomain
  • HTTP and HTTPS versions of a URL that both serve the same content
  • The same content served on various ports
  • A website that has both a www and a non-www version
  • Syndicated content that is shared on other sites

These are some major conditions in which we can apply URL Canonicalization to save our site from facing any kind of duplicate content penalties.
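
One way to spot several of these cases in a list of crawled URLs is to reduce every URL to a common key and see which ones collide. The sketch below is a rough illustration and makes several assumptions: https without www is the preferred form, index.php and index.html serve the same page as their directory, and non-default ports are kept because whether they serve the same content depends on the site. The names canonical_key and group_duplicates are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}
# Assumption: these file names serve the same page as their directory.
INDEX_FILES = ("index.php", "index.html")


def canonical_key(url, scheme="https", strip_www=True):
    """Reduce common URL variants to one key used only for grouping."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if strip_www and host.startswith("www."):
        host = host[4:]
    # Keep a port only when it is not the default for the original scheme.
    if parts.port and parts.port != DEFAULT_PORTS.get(parts.scheme):
        host = "{}:{}".format(host, parts.port)
    path = parts.path or "/"
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]
    # The fragment is dropped; the query string is kept as-is.
    return urlunsplit((scheme, host, path, parts.query, ""))


def group_duplicates(urls):
    """Map each key to the original URLs that collapse into it (2 or more)."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_key(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}


urls = [
    "http://xyz.org",
    "http://www.xyz.org",
    "http://xyz.org/index.php",
    "https://xyz.org:443/",
    "http://xyz.org/about",
]
for key, found in group_duplicates(urls).items():
    print(key, "<-", found)
```

Every group it reports is a candidate for a redirect or a rel="canonical" tag. It cannot tell you whether the pages really show the same content, so treat the output as a list of URLs to check by hand.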

When should URL Canonicalization NOT be performed?

There are scenarios in which we should not perform URL Canonicalization, and this section lists them. You can also think of these as common mistakes in URL Canonicalization.

Skip pagination canonicalization

Canonicalizing paginated URLs is a very bad idea. Do not add a canonical tag on the second page of a series that points to the first page, because the second page will then not be indexed by Google at all.

Multiple Canonical tags

If a web page has multiple rel="canonical" tags, it can be harmful to your website. Use a single tag and make it clear which URL you prefer.

I have seen that many people apply the Canonical tag like this:

<link rel="canonical" href="index.php">

This style of canonicalization invites a lot of errors, because a relative path like this is resolved against the URL of whatever page it appears on. The more complete your canonical markup is, the better it will be for you and your content, so use an absolute URL that includes the scheme and the domain.

<link rel="canonical" href="http://thewebpage.org/index.php">

The above markup is a better way to apply canonicalization.
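
To see why the relative form is risky, here is a small Python sketch that resolves a canonical href against the address of the page it sits on, which is roughly what a crawler has to do. The /blog/post/ path is a made-up example, and resolve_canonical and is_absolute are hypothetical helper names.

```python
from urllib.parse import urljoin, urlsplit


def resolve_canonical(page_url, href):
    """Resolve a canonical href relative to the page it appears on."""
    return urljoin(page_url, href)


def is_absolute(href):
    """True when the href carries both a scheme and a host."""
    parts = urlsplit(href)
    return bool(parts.scheme and parts.netloc)


page = "http://thewebpage.org/blog/post/"  # hypothetical page address
print(resolve_canonical(page, "index.php"))
print(resolve_canonical(page, "http://thewebpage.org/index.php"))
print(is_absolute("index.php"))
```

The relative href ends up pointing inside /blog/post/ instead of at the site root, while the absolute one stays the same no matter which page carries it.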

Canonicalization on the mobile version of websites

A canonical tag alone is not enough to distinguish a mobile website hosted on a subdomain of your main website. Google suggests that you use both rel="alternate" and rel="canonical" to signal that one URL displays the mobile version of the other.

Here is how you can implement it:

> <html>

> <head>

> <link rel="canonical" href="http://example.com/">

> <link rel="alternate" href="http://m.example.com/" media="only screen and (max-width: 640px)">

> </head>

> <body>

Do not use a Canonical tag outside of <head>

Search engine bots will ignore canonical tags placed outside the <head> area of the page, so to apply a proper canonical tag, you need to add it between <head> and </head>.

Do not use multiple Canonical tags on your website

Using multiple Canonical tags is pointless. Search engines will likely ignore all of them, and you will face weird SEO behavior and issues on your website. Multiple canonical tags are sometimes caused by plugin glitches, so keep an eye on that.
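
Both of these mistakes, a tag outside <head> and more than one tag on a page, are easy to check for automatically. Below is a minimal sketch using Python's built-in HTML parser. The class and function names are hypothetical, and the parser only tracks explicit <head> and </head> tags, so it does not reproduce everything a real search engine does.

```python
from html.parser import HTMLParser


class CanonicalAudit(HTMLParser):
    """Record each rel="canonical" link and whether it sits inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.found = []  # list of (href, inside_head) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link":
            attr = dict(attrs)
            rel = (attr.get("rel") or "").lower().split()
            if "canonical" in rel:
                self.found.append((attr.get("href"), self.in_head))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def audit_canonicals(html):
    """Return a list of readable problems; an empty list means none found."""
    parser = CanonicalAudit()
    parser.feed(html)
    parser.close()
    problems = []
    if len(parser.found) > 1:
        problems.append("multiple canonical tags: {}".format(len(parser.found)))
    for href, inside_head in parser.found:
        if not inside_head:
            problems.append("canonical tag outside <head>: {}".format(href))
    return problems


page = (
    '<html><head><link rel="canonical" href="http://example.com/a"></head>'
    '<body><link rel="canonical" href="http://example.com/b"></body></html>'
)
for problem in audit_canonicals(page):
    print(problem)
```

Running a check like this after installing or updating an SEO plugin is a cheap way to catch the plugin glitches mentioned above.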

Do not point a Canonical URL to a website with a non-200 status code

A canonical URL that returns a redirect code like 301 or 302 forces search engines to crawl one extra URL, which means they need to crawl two URLs instead of one. Across a whole site this adds up to a big amount, and it can easily deplete your crawl budget.
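
You can test the target of a canonical tag yourself before a search engine does. The sketch below sends a single HEAD request without following redirects and reports anything other than a 200. It assumes the server answers HEAD requests the same way it answers GET, which is not true of every server. The names fetch_status and canonical_target_problem are hypothetical.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Send one HEAD request and return the status code (no redirects followed)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.hostname, parts.port, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


def canonical_target_problem(url, get_status=fetch_status):
    """Return a description of the problem, or None when the target answers 200."""
    status = get_status(url)
    if status == 200:
        return None
    if status in (301, 302, 307, 308):
        return "{} redirects ({}); point the tag at the final URL".format(url, status)
    return "{} returned status {}".format(url, status)


# Offline demonstration with a stand-in for the network call.
print(canonical_target_problem("http://example.com/old", get_status=lambda u: 301))
```

The get_status parameter is there so the check can be tried without network access. Leave it out to query the real URL.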

Do not use Canonicalization for PageRank Sculpting

PageRank is no longer a public metric for a website, but it is still considered by the search engines. If you are planning to use Canonical tags for PageRank sculpting to get better rankings, let me make it clear that it will do your website more harm than good.

Conclusion

The concept of onsite SEO is much bigger than it looks. You need to take care of many things at once, and you also need to keep yourself updated with the changes that take place every day.