We went deeper into duplicate content, which is an important factor in canonicalization, but we wanted to give you other factors that also play a role. In this post, we’ll go into sitemaps, page links, URL redirects, and more.
Our hope is that after this, you will understand a bit better why your private label SEO company seems to always want to fix something on your website!
Whatever URLs you have on your sitemap make a difference vis-a-vis canonicalization. They can signal which pages Google should prioritize based on their location or rank. In your sitemap, you should really only include the pages that you want Google to index.
However, there can be times when you would include other pages because you need them crawled. For example, if you just moved your website to another domain or changed the categories on some pages, you might think that you only need an updated sitemap. The truth is that you should still keep a live sitemap that reflects the old pages as well, even if they are not canonical. This helps Google recognize the redirects so that the changes are reflected faster. Once all the changes are on record, you can delete this sitemap.
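As a rough illustration of keeping a sitemap limited to the pages you want indexed, here is a minimal sketch that builds a sitemap from a list of preferred URLs. The function name `build_sitemap` and the example.com URLs are assumptions made for this example, not something a particular CMS or plugin provides.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML as a string, listing each URL once, in order."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    seen = set()
    for url in urls:
        if url in seen:
            continue  # skip duplicates so each page is listed only once
        seen.add(url)
        url_element = ET.SubElement(urlset, "url")
        ET.SubElement(url_element, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical list of preferred (canonical) URLs.
canonical_urls = [
    "https://www.example.com/",
    "https://www.example.com/services",
    "https://www.example.com/",  # accidental duplicate, dropped by the function
]

sitemap_xml = build_sitemap(canonical_urls)
print(sitemap_xml)
```

During a migration, the same function could be run a second time on the list of old URLs to produce the temporary sitemap described above.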
The way you go about your internal linking can affect how Google considers which version of your site is canonical.
When you link to supplemental resources in your content, we recommend linking to your preferred canonical version of those pages. Remember to update any URLs whenever needed to maintain the consistency of your preferred canonical. Note that if you encounter any conflicts that make you choose between the user experience and standard SEO practices, you should always prefer to serve the user.
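One way to spot internal links that point at a non-preferred version of your site is to compare each link against your preferred scheme and host. The sketch below assumes the preferred version is https://www.example.com; the constants, the function name `non_canonical_links`, and the sample links are all hypothetical.

```python
from urllib.parse import urlsplit

# Assumed preferred version of the site (hypothetical values).
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.example.com"


def non_canonical_links(links, site_hosts):
    """Return internal links that do not use the preferred scheme and host.

    Links whose host is not in site_hosts are treated as external and
    skipped. Relative links have no host, so they are skipped as well.
    """
    flagged = []
    for link in links:
        parts = urlsplit(link)
        if parts.hostname not in site_hosts:
            continue
        if parts.scheme != PREFERRED_SCHEME or parts.hostname != PREFERRED_HOST:
            flagged.append(link)
    return flagged


links_on_page = [
    "http://example.com/pricing",       # old scheme, no www
    "https://www.example.com/pricing",  # preferred version
    "https://other.org/resource",       # external link
    "/about",                           # relative link
]

flagged_links = non_canonical_links(
    links_on_page, {"example.com", "www.example.com"}
)
print(flagged_links)
```

Any link this returns is a candidate for updating to the preferred canonical version.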
External linking also has an impact on canonicalization. No one has control over how other websites link to your pages. However, if you can get them to update those links to the latest ones you have live, it will help a lot. This is because the linked version signals to Google what page you want them to index.
As the name implies, URL redirects are a means of diverting traffic from one URL over to another. Your URL redirects will signal to Google what your preferred canonical is. Redirects pass PageRank and help Google choose which page to show in the index.
Some of the most commonly used URL redirects include:
301 redirects – These are used when you want to migrate a page or website permanently and have the new page indexed with its SEO value transferred. For canonicalization, they funnel people from multiple URL versions on the same site to the “master copy” of your site.
302 redirects – These are the temporary version of 301 redirects. After enough time has passed, or if you redirect a page to a URL that already exists and is already indexed, search engines can treat 302s as permanent redirects. This means that they will pass signals forward. When enough signals point to the new URL rather than the old URL, the scales tip. This happens with both internal and external linking.
307 redirects – These are also temporary redirects, but they are more specific than 302 redirects: they explicitly tell Google that the URL has been temporarily relocated.
Google treats the following as permanent redirects:
- Meta and HTTP refresh 0
- HTTP 301 and 308
- Crypto redirect
And the following as temporary redirects:
- Meta and HTTP refresh >0
- HTTP 302, 303, and 307 (server-side)
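If you are auditing redirects from a crawl export, the HTTP status codes in the list above can be sorted into permanent and temporary with a small lookup. This is a minimal sketch: it covers only the server-side status codes, not meta refresh or crypto redirects, and the function name `redirect_kind` is an assumption for illustration.

```python
# Server-side redirect status codes, grouped by how long they are meant to last.
PERMANENT_REDIRECTS = {301, 308}
TEMPORARY_REDIRECTS = {302, 303, 307}


def redirect_kind(status_code):
    """Classify an HTTP status code as a permanent or temporary redirect."""
    if status_code in PERMANENT_REDIRECTS:
        return "permanent"
    if status_code in TEMPORARY_REDIRECTS:
        return "temporary"
    return "not a redirect covered here"


for code in (301, 302, 307, 308, 200):
    print(code, redirect_kind(code))
```

A temporary code on a move that is meant to be permanent is worth flagging, since it sends a weaker signal about which URL you prefer.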
Note that Google will normally consider consolidation to be permanent after a year has passed. Temporary redirects that are removed or changed before this time will be treated as temporary. If removed or changed after a year, the signals will remain pointing to the page where the URL was redirected. If you return to the original page, new signals will go to this restored page, and the old ones will consolidate at the redirected page.
Checking the Canonical
The URL Inspection tool in Google Search Console allows you to check the indexed version of your page. You can find out what Google selected as your canonical page under Page indexing.
Aside from Search Console, another way to check which version is canonical is to copy and paste the URL into Google and see which page shows up as the top result. If the cached version differs from the one displayed, that indicates that Google has selected a different canonical.
Note that the Google search engine results pages will display the pages that Google is aware of. This does not guarantee that you will see all the indexed pages or the pages that were determined to be canonical.
If some form of human error or oversight has occurred regarding your canonicalization efforts, it isn’t the end of the world. Google’s AI is still pretty smart in its own right, and there is a chance they will be able to understand which page looks like the one you prefer to be canonical.
What Not to Do
Here are some common errors that website owners make when dealing with canonicalization.
A robots.txt file is designed to keep a search engine from crawling certain pages, content, or URLs. A blocked URL is no longer crawlable, meaning Google cannot see that page or any canonical tags on it.
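Before relying on a canonical tag, it can help to confirm that the page carrying it is not blocked from crawling. The sketch below uses Python's standard `urllib.robotparser` module on a made-up robots.txt; the rules and URLs are hypothetical, and a standard-library parser will not necessarily match Google's own robots.txt handling in every edge case.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A blocked URL cannot be crawled, so any canonical tag on it goes unseen.
blocked = not parser.can_fetch("Googlebot", "https://www.example.com/private/page")
crawlable = parser.can_fetch("Googlebot", "https://www.example.com/blog/post")

print("private page blocked:", blocked)
print("blog post crawlable:", crawlable)
```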
Noindex is a tag that tells Google not to index a page. This is usually done so that Google won’t show it on search results. Naturally, you don’t want to put this tag on your preferred canonical. Even if you do, Google usually prioritizes the rel=canonical tag over noindex. However, you should still check to be sure.
4XX HTTP status code
A 4XX HTTP status code tells Google that a webpage is unreachable. Returning one has much the same effect as noindex, but it could hurt your site’s rankings if the unavailable page is one that ranks well.
Google Search Console URL Removal
You may think that you can get rid of the other unwanted versions of your URL and keep the few that you deem the most important, including your canonical version. However, what most people don’t realize is that this can delete all versions of the URL and not just selected ones. When this happens, you are deindexing your page.
Because Google looks at multiple signals, if you want to tell them which URL is your preferred one, you must keep these consistent across your website. Otherwise, Google will select a canonical that may not be the URL you worked hard to build.
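A simple way to check for mixed signals is to collect the URL that each signal points to for a given page and see whether they all agree. This is only a sketch of the idea: the signal names, the function `distinct_targets`, and the URLs are hypothetical, and gathering the real values is left to your crawler or SEO tool of choice.

```python
def distinct_targets(signals):
    """Return the sorted, de-duplicated URLs that the given signals point to.

    Signals with no value (None or an empty string) are ignored.
    """
    return sorted({url for url in signals.values() if url})


# Hypothetical signals collected for one page.
signals = {
    "rel_canonical": "https://www.example.com/shoes",
    "sitemap": "https://www.example.com/shoes",
    "internal_links": "https://www.example.com/shoes?ref=nav",
    "redirect_target": None,  # this page is not redirected
}

targets = distinct_targets(signals)
is_consistent = len(targets) <= 1

print("consistent:", is_consistent)
print("targets:", targets)
```

Here the internal links point to a parameterized version of the URL, so the signals disagree and the page is worth a closer look.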
More Than One rel=canonical Tag
Canonical Tag Location
The canonical tag, or rel=canonical, is a piece of code that should be placed in the <head> section of a page or in the HTTP header so that Google sees it when crawling the page. If the tag appears in the <body> section of a page, it will be as if it had not been added at all.
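To check where the canonical tag sits on a page, and whether there is more than one, you can scan the HTML with Python's standard `html.parser` module. The class name `CanonicalFinder` and the sample markup are assumptions for illustration, and this sketch reads the raw HTML only; it does not account for tags injected by JavaScript or sent in the HTTP header.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record every rel=canonical link and whether it sits inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.found = []  # list of (href, location) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link":
            attributes = dict(attrs)
            rel_values = (attributes.get("rel") or "").lower().split()
            if "canonical" in rel_values:
                location = "head" if self.in_head else "outside head"
                self.found.append((attributes.get("href"), location))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


# Hypothetical page with one canonical tag in <head> and a stray one in <body>.
sample_html = """
<html>
  <head>
    <title>Example</title>
    <link rel="canonical" href="https://www.example.com/page">
  </head>
  <body>
    <link rel="canonical" href="https://www.example.com/other">
  </body>
</html>
"""

finder = CanonicalFinder()
finder.feed(sample_html)
print(finder.found)
```

A healthy page should produce exactly one entry, located in the head.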
Google offered some tools in the past that we could use to help manage canonicalization, like the Google Search Console Preferred Domain setting and the URL Parameters Tool. Even though these are no longer available, you can still look to the signals mentioned in this post, along with others, to inform how Google will choose the canonical for your pages.
If you work with an SEO service, make sure they are dealing with any canonicalization issues. If you use white label branding services, you may assume that they take care of all things SEO as well. This is not always the case. Make sure you speak with any company that works on your website about SEO specifically, and get into as much detail as possible about what they do and don’t do. As a white label digital marketing agency, we can handle these SEO issues and your other digital marketing needs.