The Truth about Duplicate Content

I’m going to set the record straight on a topic that seems to puzzle a lot of bloggers – the truth about duplicate content.

Chances are that you are reading this because you’re just curious, or:

A. Someone has nicked your blog content and has published it elsewhere

B. You saw some great content and copied a part of it to use in your own blog post and now you’re worried the big G (Google, not God) is on to you

C. You heard that the post you published and gave permission for another site to use could cause the Internet to implode

Well, to all those panicking bloggers: Take a chill pill.

It’s going to be S-E-Ok.

In general there are a lot of mixed feelings and conflicting advice about duplicate content knocking around online. I see fellow bloggers ask questions about this topic on forums and in Facebook groups quite a bit.

Their questions are usually met with answers that are not totally wrong, but are way wide of the truth. It’s usually a guess or what they’ve heard, rather than an answer backed up by the facts.

What is duplicate content?

According to Google:

Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin. – Google

In a nutshell: duplicate content on your blog won’t get it slapped with a penalty, but it can hurt your ability to rank in the search engine results pages (SERPs).

However, unlike your neighbour’s old rusty garden gate that blows about in the wind, there is a catch.

Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. – Google

Even if the duplication is innocent, if the search engine thinks your blog content is duplicated in an attempt to manipulate search results to gain more traffic, it’ll make you pay for it.

And I don’t mean they’ll invoice you 20 quid (or dollars if you’re across the pond).

It’s possible they might just devalue your blog post containing duplicate content and make it harder for your fresh content to rank in the future. Or worse: in extreme cases your website will get hit by Google’s Panda algorithm and drop off the face of the planet.

But don’t go and purchase your Panda whacking stick just yet.

How to fix duplicate content on your blog

Over time you will naturally have posts containing similar content, especially if your niche is very narrow. In these instances, search engines might have trouble choosing the page you want to rank.

To make sure search engines know which of your posts to reward for its unique and relevant content, follow these duplicate content SEO best practices:

Provide unique content in each post:

As a blogger or writer, this should be something we strive to do anyway. If you’re stuck for new post ideas, check this out.

Use a Redirect:

This is called a ‘301 redirect’ and is the best way to help Google find the right post to rank.
Identify the original post you want Google to use and redirect the traffic from the duplicate posts to it. The 301 permanently redirects traffic, so in essence you’re getting rid of the duplicate posts.
You can easily apply a 301 redirect in WordPress using Yoast’s WordPress SEO plugin. Go to the post you wish to redirect and click on the ‘Advanced’ tab.
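
If you’re curious what a 301 actually does behind the scenes, here is a rough Python sketch. It is not how Yoast or WordPress implement it; the `REDIRECTS` map, the `respond` function and the post URLs are all made up for illustration.

```python
# Hypothetical map of duplicate post paths to the original post (made-up URLs).
REDIRECTS = {
    "/summer-fair-2014": "/summer-fair",
    "/summer-fair-2015": "/summer-fair",
}


def respond(path, redirects=REDIRECTS):
    """Return (status, headers) roughly the way a web server would for this path."""
    target = redirects.get(path)
    if target is not None:
        # 301 means "Moved Permanently": browsers and search engines are told
        # to use the URL in the Location header from now on.
        return 301, {"Location": target}
    # No redirect registered, so the page is served as normal.
    return 200, {}


print(respond("/summer-fair-2014"))  # (301, {'Location': '/summer-fair'})
print(respond("/summer-fair"))       # (200, {})
```

The point to take away is that the duplicate URL never serves its own content again; every visitor, Google included, is sent on to the original.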


Use a Canonical tag:

If you don’t want to get rid of the duplicate content but want to comply with best practices, then add a Canonical link to the duplicate post that points to the original.
This doesn’t redirect traffic like a 301 does, so the post is still viewable. Instead it just tells Google that it should index the post the Canonical link points to, and not this version. Sort of a soft 301 redirect.
You can easily apply a Canonical link in WordPress using Yoast’s WordPress SEO plugin. Go to the post you wish to link from and click on the ‘Advanced’ tab.
To do this manually, add a <link> element with the attribute rel="canonical" to the <head> section of the post:

<link rel="canonical" href="https://blog.example.com/example/example-post" />

Consolidate posts:

Say you have two posts talking about a yearly event and a lot of the info is the same. You could combine the content from both posts to create one post about the event, and then 301 redirect the posts to your new consolidated post. You could also try expanding each post, making them more unique.

How to deal with duplicate content across multiple sites

Duplicate content from guest posting:

Google is quite clear on this topic.

If you syndicate your content on other sites, Google will always show the version we think is most appropriate for users in each given search, which may or may not be the version you’d prefer. However, it is helpful to ensure that each site on which your content is syndicated includes a link back to your original article. You can also ask those who use your syndicated material to use the noindex meta tag to prevent search engines from indexing their version of the content. – Google

So basically if the post appears on your blog first, make sure you get a link back from the republished post.

In my opinion asking the site owner or editor to noindex their version of your content isn’t going to go down well. A good compromise is asking them to add a Canonical tag to the post, with a link back to your original post.
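
If you want to check whether a site that republished your post has actually done either of these, something like the Python sketch below would do it. It only uses the standard library and assumes you already have the page’s HTML as a string; the `HeadTagFinder` class and `check_republished` function are my own names for illustration, not part of any tool.

```python
from html.parser import HTMLParser


class HeadTagFinder(HTMLParser):
    """Collects the canonical URL and any robots noindex directive on a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # <link rel="canonical" href="..."> names the version to index.
        if tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")
        # <meta name="robots" content="noindex"> asks not to be indexed.
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            if "noindex" in (attrs.get("content") or "").lower():
                self.noindex = True


def check_republished(html, original_url):
    """Return 'canonical', 'noindex' or 'unprotected' for a republished copy."""
    finder = HeadTagFinder()
    finder.feed(html)
    if finder.canonical == original_url:
        return "canonical"
    if finder.noindex:
        return "noindex"
    return "unprotected"


original = "https://blog.example.com/example/example-post"
page = '<html><head><link rel="canonical" href="' + original + '" /></head></html>'
print(check_republished(page, original))  # canonical
```

A result of 'unprotected' just means neither tag was found, which is your cue to drop the site owner a friendly email.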

Republishing your guest posts on your own site:

Sometimes you’ll want to republish a guest post in part or in full on your own site. I’d prefer to let that guest post be unique to the host blog, or totally rewrite it for my own site.

If you really do want to repost it, then let Google know where the original was published and use a Canonical tag on your own version of the post.

Duplicate content from post scraping:

I see bloggers concerned about this all the time, and I get why. You spend ages writing an amazing blog post, only to see it reproduced without your permission on another site, images, links and all.

This is one of the annoying aspects of blogging. Usually, the larger the blog, the more it’s bound to get scraped.

But the good news is that content scraping won’t hurt your blog. It won’t help you either, but the main thing is it won’t affect your online efforts.

Nine times out of ten they will take your post exactly, word-for-word. If your internal linking is good, the links in the scraped article will all point back to your site. While these links won’t pass on any authority, you could see a trickle of traffic from the offending site.

Usually the site that has scraped and republished your content will be a small blog or website in the Far East with no rank or authority.

If you find that a scraper site has managed to outrank your own site with your post content, then let Google know about it! Use their scraper report tool to report the issue.

How to identify duplicate content

You may not know if you’ve got a duplicate content issue. The good news is there are plenty of methods and tools for finding and dealing with duplicate content. I’ve covered these in detail in a post over on Peter Rhys Design.

For this post, I’ll mention my go-to tool:
Siteliner: It’s great for quickly finding out any posts or pages on your blog with a high number of matching words to other existing posts. The free version allows the search of up to 250 pages, which is usually more than enough for the average blogger. www.siteliner.com

It’s also good for identifying posts with ‘thin content’ that you could beef up and republish.
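
To give a feel for what “matching words” means, here is a small Python sketch that scores how similar two pieces of text are. I don’t know how Siteliner does it internally, so treat this as one common approach (overlapping three-word runs compared with Jaccard similarity) rather than their method; the function names and sample sentences are made up.

```python
import re


def shingles(text, size=3):
    """Break text into overlapping runs of `size` lowercase words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b, size=3):
    """Jaccard similarity: shared runs divided by all runs, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        # Too short to compare, so call it no overlap.
        return 0.0
    return len(sa & sb) / len(sa | sb)


post_a = "The village fair takes place every July on the green"
post_b = "The village fair takes place every August on the green"
print(round(similarity(post_a, post_b), 2))  # 0.45
```

A score near 1.0 means two posts are nearly identical, and a score near 0.0 means they share almost nothing. Where you draw the line for “too similar” is a judgment call.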

So, you now know the truth about duplicate content and how to deal with it!

Do you have any duplicate content issues to iron out?
Let me know about it in the comments and how you plan on dealing with it:
