Cloning content is for dummies

Duplicate content. Every time someone mentions those words to me I shiver as I wonder what’s going to come next.

Usually it’s a tale about how everything on the page should be different to anything else, all content should be put through CopyScape to check it and if any words even appear to be in the same order as the same words on another page then you’ll probably be hit with a Google penalty.

First then, let’s look at the history of duplicate content and why it’s even a “thing” that people talk about.

A history of bad SEO

Way back in the history of Internet marketing, say 2010 or so, came the “affiliate marketer”, a strange beast of a person who made all his money simply re-selling things like weight loss pills to people who needed them. A good research tactic back then was to look to see who was ranking well for the term “weight loss pills” and then copy their content pretty much wholesale onto a new site.

Google was still finding its feet and on-site content was very important to ranking, but it really didn’t have the brains to work out what was good content or what was bad content yet. Content on your site and the number of links pointing to it were the big ranking factors. So the hapless marketing guru copied most of the content off a well-ranking site, created thousands upon thousands of links to the copy and, within a few weeks, the copy was in the top ten.

And that’s the truth. Back then it really was possible to rank that easily, so people did it in droves, and not just for weight loss pills but for all kinds of products. It worked, until Google did something about it and the mystical and mythical “duplicate content penalty” arrived.

Indeed, in the B2B technology arena, sites such as electronicstalk and engineeringtalk vanished from Google, having previously been in the top few positions. Even way back then, they knew to start seeking out unique content soon afterwards.

Is it a penalty?

Let’s just get a few things straight. There’s no specific duplicate content penalty as such. You see, it comes under the umbrella of “giving a good user experience” that Google (and indeed other search engines) go on about quite a lot. What they’re looking at doing is making sure that people get good, relevant and timely information and if you do a search for “best ways to exercise”, it’s not really of any value to see a top ten full of exactly the same article – and this is what used to happen. It was perfectly possible to search for something and see ten repeated articles, all exactly the same. In essence, duplicate content.

So Google decided that it would only show one of them in most cases and it did this by filtering out those that it deemed had simply copied the content. Obviously it had to decide which one was the best, so it based this on the number and quality of links together with a lot of other factors.
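
To make the idea concrete, here is a minimal sketch of that kind of filter in Python. It is purely illustrative and not how Google actually works: the Page class, the inbound_links count standing in for “number and quality of links”, and the example URLs are all made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str
    inbound_links: int  # assumption: a crude stand-in for link number and quality


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so trivially different copies match."""
    return " ".join(text.lower().split())


def filter_duplicates(pages: list[Page]) -> list[Page]:
    """Keep one page per distinct article: the copy with the most inbound links."""
    best: dict[str, Page] = {}
    for page in pages:
        key = normalise(page.text)
        current = best.get(key)
        if current is None or page.inbound_links > current.inbound_links:
            best[key] = page
    # Show the strongest surviving pages first.
    return sorted(best.values(), key=lambda p: p.inbound_links, reverse=True)


# Hypothetical search results: one original, one copy, one unrelated article.
results = filter_duplicates([
    Page("https://original.example/exercise", "Best ways to exercise ...", 120),
    Page("https://copy-one.example/exercise", "best ways to  exercise ...", 15),
    Page("https://other.example/workouts", "A different article entirely", 40),
])
for page in results:
    print(page.url)
```

The copy is not punished in this sketch; it simply isn’t shown, because a stronger page with the same content already fills that slot.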

Many webmasters then saw their pages disappear and they cried “duplicate content penalty” from the rooftops.

This is not a penalty, it’s merely Google filtering results to give a better result to people searching. Of course, if your site has suddenly disappeared from the rankings then it seems like a penalty, but really it’s not.

Could I get a penalty?

Google does, of course, dish out penalties with impunity. They are judge, jury and executioner on all of our websites, and if they believe you’ve been breaking their rules then they reserve the right to simply slap you out of their index immediately. This is called a “manual penalty” and, as the name suggests, a human does this manually; it’s not done by the algorithm.

If one of the employees at Google sees your site and decides that you’ve been doing bad things and copying content, then they might decide to give you a manual penalty. Yes, that can be a penalty in part due to your duplicate content, but no, that’s not a duplicate content penalty either. It’s a “bad practices” penalty and you should know better.

Google is pretty clear on this:

“In the rare cases in which Google perceives that duplicate content may be shown with intent to manipulate our rankings and deceive our users, we’ll also make appropriate adjustments in the indexing and ranking of the sites involved.”

Note that word: “rare”.

Others who follow Google closely put it this way:

“Google has said time and time again, duplicate content issues are rarely a penalty. It is more about Google knowing which page they should rank and which page they should not. Google doesn’t want to show the same content to searchers for the same query; they do like to diversify the results to their searchers.”

When could this happen to me?

Google is only looking to penalise those who would try to fool it and get artificially high rankings for stolen content. It therefore looks for sites that:

  • Have nothing but scraped content
  • Scrape images, auto-translate pages, or use automated tools to spin content prior to publication
  • Purposefully create pages with nearly identical content to rank them for various locations/keywords

Essentially, if you try to create a site with the obvious intention of gaining rankings from copied or spun content rather than helping users, then Google reserves the right to penalise it.

What’s best practice then?

Duplicate content isn’t great, we know that. However, having a few paragraphs of information that’s the same as someone else’s isn’t going to get your website thrown into a pit. That being said, unique content is ideal and should be encouraged. If you have a website full of unique content that will stun and amaze those who read it, you’re on to a winner, but don’t be afraid to quote other people’s work if it’s in context. Google is trying its best to bring the best results to the fore, and that’s why sites that do nothing but copy other people’s work won’t get full prominence. However, if you are not trying to deceive users and you’re not trying to fool Google, you should generally be OK.

What about products?

Good question. One of the areas where duplicate content crops up a lot is with products on e-commerce sites. You’ve probably seen some sites that have the same product in multiple categories. For example, is a printer part of the “electronics” department or is it “office supplies”?

If it’s in both then the exact same page with the exact same content could appear at two addresses, one under each category.

Different URL, same item. In fact many stores have this happen lots of times so products are listed on five or more URLs. Now, this is a special case and again, in the past you might have had multiple listings in the top ten for exactly the same product. Not any more.

Google will display the one it thinks is the most relevant but if you like, you can tell it which one to show in preference. You do this by using the “rel=canonical” tag.

I won’t go into the specifics here, but what happens is you simply decide which is the “master” category. For example, you might decide that the product belongs in “electronics”. Then, on every other occurrence of the product, you put the following in the page’s <head> section, with the href set to the URL of the master page:

<link rel="canonical" href="" />

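So now Google knows which page you would prefer it to display. It treats the tag as a strong hint rather than a command, but in most cases that is enough.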
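
If your shop generates its pages from templates, the same decision can be made in code. What follows is only a rough sketch in Python under stated assumptions: the shop.example URLs, the printer-x100 product and the helper names pick_canonical and canonical_tag are all hypothetical, and a real e-commerce platform will have its own way of setting this.

```python
from html import escape

# Hypothetical URLs: the same printer listed under two categories.
PRODUCT_URLS = [
    "https://shop.example/electronics/printer-x100",
    "https://shop.example/office-supplies/printer-x100",
]
MASTER_CATEGORY = "electronics"  # the category chosen as the "master"


def pick_canonical(urls, master_category):
    """Return the listing that sits under the chosen master category."""
    for url in urls:
        if f"/{master_category}/" in url:
            return url
    raise ValueError(f"no listing found under {master_category!r}")


def canonical_tag(url):
    """Build the <link> element to place in each duplicate page's <head>."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}" />'


print(canonical_tag(pick_canonical(PRODUCT_URLS, MASTER_CATEGORY)))
```

Every duplicate listing of the product would carry the same tag, so they all point at the one “electronics” page.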

Will I get a penalty if I don’t do this?

No, unless you’ve got hundreds of examples and it’s quite clear that either a) you have a really badly configured shopping system or b) you’re going out of your way to annoy Google. Without the tag, though, you are leaving it up to Google to decide which page to display, so it’s best to use it.

In summary…

Duplicate content is only an issue if you’re going out of your way to try to fool Google. If you’re copying other people’s work wholesale and trying to gain rankings simply by using this tactic then it’s unlikely your pages will appear in search, so it’s pointless. If you go out of your way to create a spam site then there’s a chance Google will slap a manual penalty on your site, but this is rare.

In most cases you’re going to be OK, but if you end up with a penalty and it’s to do with duplicate content, you’ve probably been very naughty indeed!