Google: Sometimes Copied Content is More Relevant

Published

December 14, 2020

Google’s John Mueller answered a question about how Google handles copied content. Mueller’s answer covered different kinds of duplicate content. He ended his response with an observation about the original source of content and relevance.

How Does Google Determine Original Source of Content?

The person asking the question wanted to know how Google determines which content is the original and which is a copy.

The question:

“How Google determines if a particular… content is copied or not and who is the original source for that content?”

Mueller gave a broad answer, using different kinds of copied content as examples, such as boilerplate text repeated across an entire website.

The result was a rounded overview of the kinds of copied content that Google has to deal with.

Screenshot: Google’s John Mueller explaining how Google handles copied content and why a copy can outrank the original

Mueller answered:

“I think it’s kind of tricky in some aspects… in some aspects it’s really easy because if you take a piece of text and you search for it then it’s exactly the same text that is on the web or on other pages then that’s a pretty good sign that this is copied content.
So for example if you have copied content that is more along the lines of boilerplate text like, you have… legal disclaimer on the bottom of your site. Which is something that you have across all of your pages of your site.
Then technically, that’s copied content.
But practically for us, that’s not really an issue because these are things that people are generally not searching for. It’s not that they’re searching for the legal disclaimer and they want to find your site.
It’s more they’re looking for your primary content. And… in that regard it’s something where we try to weigh the copied content appropriately but also kind of like still look at the rest of your site.
It’s easy to recognize that there’s copied content on these pages but it’s hard to figure out what we should do about that copied content.”
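
To make the two ideas in that answer concrete (verbatim text is easy to spot, and site-wide boilerplate is technically copied but should count for little), here is a minimal Python sketch. It is an illustration only and not how Google does this: the function names, the sentence-level matching, the 100% threshold for calling something boilerplate, and the sample pages are all assumptions made for the example.

```python
from collections import Counter


def sentences(text):
    """Split text into normalized sentences (lowercased, whitespace collapsed)."""
    parts = text.replace("!", ".").replace("?", ".").split(".")
    return [" ".join(p.lower().split()) for p in parts if p.strip()]


def find_boilerplate(site_pages, min_share=1.0):
    """Return sentences that appear on at least min_share of a site's pages."""
    counts = Counter()
    for page in site_pages:
        counts.update(set(sentences(page)))  # count each sentence once per page
    needed = min_share * len(site_pages)
    return {s for s, n in counts.items() if n >= needed}


def copied_share(page, other_page, boilerplate):
    """Fraction of a page's non-boilerplate sentences found verbatim on another page."""
    primary = [s for s in sentences(page) if s not in boilerplate]
    if not primary:
        return 0.0
    other = set(sentences(other_page))
    return sum(s in other for s in primary) / len(primary)


# Hypothetical pages: two from one site sharing a legal footer, one from a copier.
site = [
    "Blue widgets last longer. All rights reserved by Example Co.",
    "Red widgets are cheaper. All rights reserved by Example Co.",
]
copier = "Blue widgets last longer. Buy now from us."

boilerplate = find_boilerplate(site)
share = copied_share(site[0], copier, boilerplate)
```

In this toy data the legal footer is flagged as boilerplate and ignored, while the primary sentence of the first page turns up verbatim on the copier's page. Note that the sketch only detects the overlap. As Mueller says, deciding what to do about it is the hard part.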

Determining Ownership of Content is Difficult

The audio in the following passage is garbled, so part of the quote is in parentheses to mark my best estimate of what Mueller said.

Mueller then addressed authorship, particularly the difficulty of identifying who the primary author is.

Mueller said:

“(With regard to the) author or owner of that content, I don’t think we go and make any judgement in that regard because that’s really tricky like… we can’t determine who is the owner.”

Why Copied Content Can Be More Relevant Than the Original

Mueller ended by using Google’s own blog posts as an example of Google’s algorithm ranking other sites ahead of Google’s own content. He said it was about relevance.

“And sometimes the person who wrote it first is not the one for example that is the most relevant.
So we see this a lot of times for example with our own blog posts where we will write a blog post and we’ll put the information we want to share on our blog post and someone will copy that content and they will add a lot of extra information around it.
It’s like, here’s what Google really wants to tell you and it’s like reading between the lines and the secret to Google’s algorithms.
And when someone is searching it’s like maybe they want to find the original source. Maybe they want to find this more elaborate… exploration of the content itself.
So, just because something is original doesn’t mean that it’s the one that is the most relevant when someone is looking for that information.”

Originality Doesn’t Always Mean Relevant

Quoting something and offering commentary on the quote is generally regarded as fair use. Mueller’s point was that what counts most about content is how well it relates to the user’s query.

Sometimes that means making sure the content answers the why, how, and what questions that are implicit in some search queries.