This article is part of our series on understanding search engine optimization (SEO). Here we look at the subject of the canonical URL.
Did you know that a single web page can mistakenly end up reachable through many different URLs? This is not a good thing. To avoid search engine duplicate content penalties and rank higher, you need to give search engines a hint.
You do this with a canonical URL link tag.
Where a Canonical URL Is Needed
Before we look at how to use this tag, it helps to understand the problems that can arise when publishing.
The Home Page
The most common case is your home page. Given these URLs for your domain, which one best represents it?
// Example #1: Home page
http://www.yoursite.com
http://www.yoursite.com/
http://www.yoursite.com/index.htm
http://www.yoursite.com/index.html
http://www.yoursite.com/index.php
http://yoursite.com
http://yoursite.com/
http://yoursite.com/index.htm
http://yoursite.com/index.html
http://yoursite.com/index.php
Search engines need to know your preferred URL. They don’t want to store multiple copies of the same document in their database. That takes up disk space, memory, and CPU cycles, and leaves their search results cluttered with references that all point to the same page. They use this as an SEO signal for ranking your page too.
In the home page case, most search engines know what is going on. They’ll pick one of these – perhaps incorrectly. In other cases it may not be so straightforward, but they still have to choose. So you have to help them.
Take a look at this scenario where the intent is to treat the category as an index page. Again, which is the best one to use?
// Example #2: Category page
http://www.yoursite.com/category
http://www.yoursite.com/category/
http://www.yoursite.com/category/index.htm
http://www.yoursite.com/category/index.html
http://www.yoursite.com/category/index.php
http://yoursite.com/category
http://yoursite.com/category/
http://yoursite.com/category/index.htm
http://yoursite.com/category/index.html
http://yoursite.com/category/index.php
Things start to get a little more interesting when we move away from an index page. In the example below, we can represent an article in many different ways:
// Example #3: Article Page
http://www.yoursite.com/title-of-article
http://www.yoursite.com/title-of-article/
http://yoursite.com/title-of-article
http://yoursite.com/title-of-article/
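The variants in the first three examples differ only in the www prefix, a trailing slash, or an index file name. Here is a minimal Python sketch of how they could all be collapsed to one URL. The function name canonicalize, the choice of the non-www host as the preferred one, and the list of index file names are assumptions made for illustration, not rules that search engines publish.

```python
from urllib.parse import urlsplit, urlunsplit

# Index file names treated as equivalent to their directory.
# Assumption: taken from the example URLs in this article.
INDEX_FILES = {"index.htm", "index.html", "index.php"}


def canonicalize(url: str, preferred_host: str = "yoursite.com") -> str:
    """Collapse www/non-www, index-file, and trailing-slash variants."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Treat www.<host> and <host> as the same site.
    if host in (preferred_host, "www." + preferred_host):
        host = preferred_host
    # Splitting on "/" and dropping empty pieces removes trailing slashes.
    segments = [s for s in parts.path.split("/") if s]
    # Drop a trailing index file such as /category/index.php.
    if segments and segments[-1].lower() in INDEX_FILES:
        segments.pop()
    path = "/" + "/".join(segments) if segments else ""
    # The query string is kept as-is; the fragment is dropped.
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


if __name__ == "__main__":
    for variant in (
        "http://www.yoursite.com/index.html",
        "http://yoursite.com/category/",
        "http://www.yoursite.com/title-of-article/",
    ):
        print(variant, "->", canonicalize(variant))
```

Whether you prefer the www or the non-www form does not matter much. What matters is that every variant maps to the same answer.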
URLs With Parameters
A website can also have page URLs with parameters in them:
// Example #4: Parameters Page
http://www.yoursite.com/oranges.php?t=navel
http://www.yoursite.com/oranges.php?t=mandarin
http://www.yoursite.com/oranges.php?t=navel&box=1
http://www.yoursite.com/oranges.php?t=mandarin&color=green
http://yoursite.com/oranges.php?t=navel
http://yoursite.com/oranges.php?t=mandarin
http://yoursite.com/oranges.php?t=navel&box=1
http://yoursite.com/oranges.php?t=mandarin&color=green
Let’s not forget about pagination for categories:
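One possible way to deal with parameter variants is to decide which parameters, if any, really change the content, and drop the rest before choosing a canonical URL. The sketch below is only an illustration: strip_params and its keep argument are hypothetical names, and which parameters are safe to drop depends entirely on your own site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_params(url: str, keep: frozenset = frozenset()) -> str:
    """Return url with every query parameter removed except those in keep."""
    parts = urlsplit(url)
    # parse_qsl preserves the original parameter order.
    kept = [(key, value) for key, value in parse_qsl(parts.query) if key in keep]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


if __name__ == "__main__":
    url = "http://yoursite.com/oranges.php?t=navel&box=1"
    # Drop everything: all variants collapse to the bare script URL.
    print(strip_params(url))
    # Keep only "t", assuming it selects genuinely different content.
    print(strip_params(url, keep=frozenset({"t"})))
```

With the default empty keep set, every variant in Example #4 collapses to the bare oranges.php URL on its host. Passing a keep set is the alternative if each type of orange deserves its own indexed page.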
// Example #5: Pagination Page
http://www.yoursite.com/sports
http://www.yoursite.com/sports/page/1
http://www.yoursite.com/sports/page/2
http://yoursite.com/sports
http://yoursite.com/sports/page/1
http://yoursite.com/sports/page/2
Article, Posts, and Pages Pagination
And pagination for articles, posts, and pages:
// Example #6: Article Pagination Page
http://www.yoursite.com/sports/title-of-article
http://www.yoursite.com/sports/title-of-article/1
http://www.yoursite.com/sports/title-of-article/2
http://yoursite.com/sports/title-of-article
http://yoursite.com/sports/title-of-article/1
http://yoursite.com/sports/title-of-article/2
The Canonical Reference Link
With so many URLs able to represent one HTML resource, a search engine can become confused about which one is preferred and will make the choice without your input. You may even find your page left out of the search index, which is not a good thing!
So how do you tell the search engine which URL to use? You do this by adding a link tag with rel="canonical" to the head of every HTML document:
// Home page
<link rel="canonical" href="http://yoursite.com"/>
// Category page
<link rel="canonical" href="http://yoursite.com/category"/>
// Article page
<link rel="canonical" href="http://yoursite.com/title-of-article"/>
// Parameters page
<link rel="canonical" href="http://yoursite.com/oranges.php"/>
// Pagination page
<link rel="canonical" href="http://yoursite.com/sports"/>
// Article pagination page
// Each is its own canonical URL
<link rel="canonical" href="http://yoursite.com/sports/title-of-article"/>
<link rel="canonical" href="http://yoursite.com/sports/title-of-article/1"/>
<link rel="canonical" href="http://yoursite.com/sports/title-of-article/2"/>
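If your pages come out of a script or template, the tag can be generated rather than typed by hand. Below is a small Python sketch; canonical_tag is a hypothetical helper name. Note that the attribute carrying the URL on a link element is href, and that the URL should be HTML-escaped so an ampersand in a query string does not break the markup.

```python
from html import escape


def canonical_tag(url: str) -> str:
    """Build a rel="canonical" link tag for the given absolute URL."""
    # escape() turns & into &amp; and, with quote=True, also escapes quotes.
    return f'<link rel="canonical" href="{escape(url, quote=True)}"/>'


if __name__ == "__main__":
    print(canonical_tag("http://yoursite.com/category"))
    print(canonical_tag("http://yoursite.com/oranges.php?t=navel&box=1"))
```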
If you are building static HTML pages, this can be quite a chore. As you create pages related to one another, you have to remember which canonical URL you are using. A content management system (CMS) alleviates these types of problems by providing a place in its interface to supply the canonical URL.
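For hand-built static pages, a quick check can catch a missing or duplicated tag. The following is a rough sketch using only Python's standard html.parser module. The names CanonicalFinder and find_canonicals are made up for this example, and it makes no attempt to confirm that the tag sits inside the head element.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated values.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html_text: str) -> list:
    """Return all canonical URLs declared in the given HTML text."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    finder.close()
    return finder.canonicals


if __name__ == "__main__":
    page = (
        "<html><head>"
        '<link rel="canonical" href="http://yoursite.com/category"/>'
        "</head><body></body></html>"
    )
    print(find_canonicals(page))
```

An empty list means the page declares no canonical URL. More than one entry means the page sends conflicting hints.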
Pagination is better expressed with link tags that use rel="next" and rel="prev". For the first page of the article, it will look like this:
<link rel="next" href="http://yoursite.com/sports/title-of-article/1"/>
For the second page (/1), it will look like this:
<link rel="prev" href="http://yoursite.com/sports/title-of-article"/>
<link rel="next" href="http://yoursite.com/sports/title-of-article/2"/>
For the last page (/2), it will have this:
<link rel="prev" href="http://yoursite.com/sports/title-of-article/1"/>
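The pattern above (no prev on the first page, no next on the last) is easy to get wrong by hand. Here is a short Python sketch of it; pagination_tags is a hypothetical helper, and it assumes you already have the page URLs of one article in reading order.

```python
from html import escape


def pagination_tags(page_urls: list, index: int) -> list:
    """Return the prev/next link tags for the page at position index."""
    tags = []
    # Every page except the first points back to its predecessor.
    if index > 0:
        tags.append(f'<link rel="prev" href="{escape(page_urls[index - 1])}"/>')
    # Every page except the last points forward to its successor.
    if index < len(page_urls) - 1:
        tags.append(f'<link rel="next" href="{escape(page_urls[index + 1])}"/>')
    return tags


if __name__ == "__main__":
    pages = [
        "http://yoursite.com/sports/title-of-article",
        "http://yoursite.com/sports/title-of-article/1",
        "http://yoursite.com/sports/title-of-article/2",
    ]
    for i in range(len(pages)):
        print(pages[i])
        for tag in pagination_tags(pages, i):
            print("   ", tag)
```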
If you run a WordPress blog, your settings need to be right. The WordPress Site Address (URL) in your settings is used as your website’s base URL.
If it is set to this:
And someone surfs or backlinks to:
As a result, they will be redirected to the preferred URL via an HTTP 301 redirect.
How do you set the canonical URL for your WordPress post? Use a WordPress SEO plugin.
Yoast SEO will automatically generate the link tag if you don’t supply it in the snippet editor. If you use permalinks, you don’t have to do a thing. Yoast will automagically create the tag for you.
Yoast SEO also has a feature that keeps paginated category and tag archive pages from being indexed (i.e. noindex). Only the first page will be used.