SEO
Google On Percentage That Represents Duplicate Content

Google’s John Mueller recently answered a question about whether there’s a percentage threshold of content duplication that Google uses to identify and filter out duplicate content.
What Percentage Equals Duplicate Content?
The conversation actually started on Facebook when Duane Forrester (@DuaneForrester) asked if anyone knew if any search engine has published a percentage of content overlap at which content is considered duplicate.
Bill Hartzer (@bhartzer) turned to Twitter to ask John Mueller and received a near-immediate response.
Hey @johnmu is there a percentage that represents duplicate content?
For example, should we be trying to make sure pages are at least 72.6 percent unique than other pages on our site?
Does Google even measure it?
— Bill Hartzer (@bhartzer) September 23, 2022
Google’s John Mueller responded:
There is no number (also how do you measure it anyway?)
— 🌽〈link href=//johnmu.com rel=canonical 〉🌽 (@JohnMu) September 23, 2022
How Does Google Detect Duplicate Content?
Google’s methodology for detecting duplicate content has remained remarkably similar for many years.
Back in 2013, Matt Cutts (@mattcutts), then a software engineer at Google, published an official Google video describing how Google detects duplicate content.
He started the video by stating that a great deal of Internet content is duplicate and that it’s a normal thing to happen.
“It’s important to realize that if you look at content on the web, something like 25% or 30% of all the web’s content is duplicate content.
…People will quote a paragraph of a blog and then link to the blog, that sort of thing.”
He went on to say that because so much duplicate content is innocent and lacks spammy intent, Google won’t penalize it.
Penalizing webpages for having some duplicate content, he said, would have a negative effect on the quality of the search results.
What Google does when it finds duplicate content is:
“…try to group it all together and treat it as if it’s just one piece of content.”
Matt continued:
“It’s just treated as something that we need to cluster appropriately. And we need to make sure that it ranks correctly.”
He explained that Google then chooses which page to show in the search results and that it filters out the duplicate pages in order to improve the user experience.
How Google Handles Duplicate Content – 2020 Version
Fast forward to 2020 and Google published a Search Off the Record podcast episode where the same topic is described in remarkably similar language.
Here is the relevant section of that podcast, starting 06:44 into the episode:
“Gary Illyes: And now we ended up with the next step, which is actually canonicalization and dupe detection.
Martin Splitt: Isn’t that the same, dupe detection and canonicalization, kind of?
Gary Illyes: [00:06:56] Well, it’s not, right? Because first you have to detect the dupes, basically cluster them together, saying that all of these pages are dupes of each other,
and then you have to basically find a leader page for all of them.…And that is canonicalization.
So, you have the duplication, which is the whole term, but within that you have cluster building, like dupe cluster building, and canonicalization.”
Gary next explains in technical terms how exactly they do this. Basically, Google isn’t really looking at percentages exactly, but rather comparing checksums.
A checksum is a compact representation of content as a short series of numbers or letters. If two pieces of content are identical, their checksums will match.
This is how Gary explained it:
“So, for dupe detection what we do is, well, we try to detect dupes.
And how we do that is perhaps how most people at other search engines do it, which is, basically, reducing the content into a hash or checksum and then comparing the checksums.”
Gary said Google does it that way because it’s easier (and obviously accurate).
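As a rough illustration of the idea Gary describes, the sketch below reduces each page's text to a checksum and compares the results. The normalization step (lowercasing, collapsing whitespace) and the choice of SHA-256 are assumptions made for this example; the podcast excerpt does not say which hash Google uses or how content is prepared before hashing.

```python
import hashlib


def content_checksum(text: str) -> str:
    """Reduce page text to a fixed-length checksum (hypothetical helper)."""
    # Assumption: lowercase and collapse whitespace so trivial formatting
    # differences do not change the checksum.
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


page_a = "Duplicate content is a normal part of the web."
page_b = "Duplicate   content is a NORMAL part of the web."
page_c = "This page says something else entirely."

# Pages a and b differ only in spacing and case, so their checksums match.
same = content_checksum(page_a) == content_checksum(page_b)
# Page c has different wording, so its checksum differs.
different = content_checksum(page_a) != content_checksum(page_c)
print(same, different)
```

A cryptographic hash like this only flags exact matches after normalization. Catching near-duplicates would need a similarity-preserving fingerprint, which this sketch does not attempt.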
Google Detects Duplicate Content with Checksums
So when talking about duplicate content it’s probably not a matter of a threshold of percentage, where there’s a number at which content is said to be duplicate.
But rather, duplicate content is detected with a representation of the content in the form of a checksum and then those checksums are compared.
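The two steps Gary separates, dupe cluster building and canonicalization, can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, and the rule for picking a "leader" page (shortest URL) is a placeholder, since the excerpt does not describe how Google actually chooses one.

```python
import hashlib
from collections import defaultdict


def checksum(text: str) -> str:
    # Assumption: simple normalization before hashing, as an example only.
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_dupe_clusters(pages: dict[str, str]) -> list[list[str]]:
    """Step 1: group URLs whose content reduces to the same checksum."""
    clusters = defaultdict(list)
    for url, text in pages.items():
        clusters[checksum(text)].append(url)
    return list(clusters.values())


def pick_leader(cluster: list[str]) -> str:
    """Step 2: choose one page to represent the cluster."""
    # Placeholder rule for illustration: shortest URL, ties broken alphabetically.
    return min(cluster, key=lambda url: (len(url), url))


pages = {
    "https://example.com/post": "Hello world",
    "https://example.com/post?ref=feed": "hello   world",
    "https://example.com/other": "Something different",
}
clusters = build_dupe_clusters(pages)
leaders = sorted(pick_leader(cluster) for cluster in clusters)
print(leaders)
```

Here the two `/post` URLs land in one cluster and only one of them is kept as the representative, which mirrors the "treat it as one piece of content" behavior described earlier.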
An additional takeaway is that there appears to be a distinction between when part of the content is duplicate and all of the content is duplicate.
Featured image by Shutterstock/Ezume Images
SEO
TikTok Updated Community Guidelines To Include AI Content

TikTok has updated its Community Guidelines, which will go into effect on April 21, 2023.
The updated guidelines introduce TikTok’s Community Principles, which guide content moderation to uphold human rights and international legal frameworks.
TikTok worked with over 100 organizations globally to strengthen its rules to address new threats and reduce potential user harm.
Key changes to Community Guidelines apply to synthetic media, tribes, and civic and election integrity.
AI-Generated Content
TikTok defines “synthetic media” as content created or modified by AI. While AI and related technologies allow creators to express themselves in many new ways, they can also blur the line between fact and fiction for viewers.
Creators must label synthetic or altered media as such to mitigate the potential risks of spreading misinformation.
To reduce potential harm, synthetic media featuring real private individuals is prohibited. Private individuals include anyone under 18 and adults who are not public figures. The use of public figures over 18 – government officials, politicians, business leaders, and celebrities – is permitted, but with restrictions.
Creators must not use synthetic media to violate policies against hate speech, sexual exploitation, and severe harassment. They must also clearly disclose synthetic media and manipulated content that depict realistic scenes with fake people, places, or events.
Public figures cannot be used in synthetic audio or video for political or commercial endorsements to mislead users about financial or political issues.
You can, however, use synthetic media in artistic and educational content.
Protection Of Tribes
TikTok policies already include rules meant to protect people and groups with specific attributes from hateful behavior, hate speech, and hateful ideologies.
With the new guidelines, the platform added tribes to its list of protected attributes, which also includes ethnicity, gender, race, religion, and sexual orientation.
While TikTok allows critical content on public figures, as defined above, it prohibits language that harasses, humiliates, threatens, or doxxes anyone.
Users can consult resources and tools provided by TikTok to identify bullying behavior and configure their settings to prevent it from affecting them further.
Civic And Election Integrity
Noting that elections are essential to community dialogue and upholding societal values, TikTok recently emphasized its alleged efforts to encourage topical discussions while maintaining unity.
To achieve this goal, paid political promotion, advertising, and fundraising by politicians or parties are prohibited. This policy applies to traditional ads and compensated creator content.
TikTok claims to support informed civic idea exchanges to promote constructive conversations without allowing misinformation about voting processes and election outcomes. Content that includes unverified claims about election results will not be eligible to appear in the For You Feed.
Before these changes go into effect next month, moderators will receive additional training on enforcing them effectively.
Will Recent Changes Prevent More TikTok Bans?
TikTok’s refreshed Community Guidelines and explanation of Community Principles appear to attempt greater transparency and foster a safe, inclusive, and authentic environment for all users.
TikTok plans to continue investing in safety measures to encourage creativity and connection within its global community of one billion users.
TikTok’s latest changes to improve transparency, reduce harm, and provide higher-quality content for users may be part of efforts to prevent the app from being banned in the U.S.
This week, the House Energy and Commerce Committee will hold a full committee hearing with TikTok CEO Shou Chew on how Congress can protect the data privacy of U.S. users and children from online harm.
Organizations like the Tech Oversight Project have also expressed concerns about risks that big tech companies like Amazon, Apple, Google, and Meta pose.
Featured Image: BigTunaOnline/Shutterstock
SEO
Google Launches BARD AI Chatbot To Compete With ChatGPT

Google has unveiled BARD, an AI chatbot designed to compete with OpenAI’s ChatGPT and the chatbot Microsoft built into its Bing search engine.
In a blog post, Google describes BARD as an early AI experiment to enhance productivity, accelerate ideas, and foster curiosity.
You can use BARD to get tips, explanations, or creative assistance in tasks such as outlining blog posts.
With BARD, Google aims to solidify its presence in the AI chatbot space while maintaining its dominance in the search engine market.
BARD’s Technical Details
BARD is powered by a research large language model (LLM) – a lightweight and optimized version of LaMDA.
It will be updated with more advanced models over time. As more people use LLMs, they become better at predicting helpful responses.
BARD is designed as a complementary experience to Google Search, allowing users to check its responses or explore sources across the web.
Operating as a standalone webpage, BARD consists of a single question box rather than being integrated into Google’s search engine.
This strategic move lets Google adopt new AI technology while preserving the profitability of its search engine business.
Cautious Rollout Amid Unpredictability Concerns
Google’s cautious approach to BARD’s release is in response to the concerns over unpredictable and sometimes unreliable chatbot technology, as demonstrated by competitors.
Google recognizes LLMs can sometimes produce biased, misleading, or false information.
To mitigate these issues, Google allows you to choose from a few drafts of BARD’s response.
You can continue collaborating with BARD by asking follow-up questions or requesting alternative answers.

Google’s Race to Ship AI Products
Since OpenAI’s release of ChatGPT and Microsoft’s introduction of chatbot technology in Bing, Google has prioritized AI as its central focus.
The company’s internal teams, including AI safety researchers, are working collaboratively to accelerate approval for a range of new AI products.
Google’s work on BARD is guided by its AI Principles, focusing on quality and safety.
The company uses human feedback and evaluation to enhance its systems. It has implemented guardrails, such as capping the number of exchanges in a dialogue, to keep interactions helpful and on-topic.

In Development Since 2015
Google has been developing the technology behind BARD since 2015.
However, similar to OpenAI and Microsoft’s chatbots, BARD has not been released to a broader audience due to concerns about generating untrustworthy information and potential biases against certain groups.
Google acknowledges these issues and aims to bring BARD to market responsibly.
BARD Availability
You can sign up to try BARD at bard.google.com.
Access is initially rolling out in the US and UK, with plans to expand to more countries and languages over time. It’s possible to get around the limited rollout with a VPN.
Google requires users to have a Gmail address to sign up and doesn’t accept Google Workspace email accounts.
Sources: Google, The New York Times
Featured Image: Muhammad S0hail/Shutterstock
SEO
Will AI Kill SEO? We Asked ChatGPT

It happens every couple of years.
First, it was Jason Calacanis and Mahalo, then the early social platforms.
We saw it again with voice search and smart assistants. For a minute, it was TikTok’s turn. Then the metaverse jumped the line.
Now, it’s ChatGPT and AI.
I’m talking, of course, about “SEO killers.”
Every now and then, a new technology comes along, and three things inevitably happen:
- Thousands of SEO professionals publish posts and case studies declaring themselves experts in the new thing.
- Every publication dusts off its “SEO is dead” article, changes the date, and does a find and replace for the new technology.
- SEO continues to be stronger than ever.
Rinse, repeat.
It would seem that search has more lives than a cartoon cat, but the simple truth is: Search is immortal.
How we search, what devices we use, and whether the answer is a link to a website will forever be up for debate.
But as long as users have tasks to complete, they’ll turn somewhere for help, and digital marketers will influence the process.
Will AI Replace Search?
There’s a ton of hype right now about AI replacing both search engines and search professionals – I don’t see that happening. I view ChatGPT as just another tool.
Much like a knife: You can butter bread or cut yourself. It’s all in how you use it.
Will AI replace search engines? Let’s ask it ourselves!
That’s a pretty good answer.
Many SEO professionals (including me) have been saying for years that the days of tricking the algorithm are long gone.
SEO has been slowly morphing into digital marketing for a long time now. It’s no longer possible to do SEO without considering user intent, personas, use cases, competitive research, market conditions, etc.
Ok, but won’t AI just do that for us? Is AI going to take my job? Here’s a crazy idea: Let’s ask ChatGPT!

AI Isn’t Going To Take Your Job. But An SEO Who Knows How To Use AI To Be More Efficient Just Might
Why? Let’s dive in.
I still see a lot of SEO pros writing articles that ask AI to do things it’s simply incapable of – and this comes from a basic misunderstanding of how large language models actually work.
AI tools, like ChatGPT, aren’t pulling any information from a database of facts. They don’t have an index or a knowledge graph.
They don’t “store” information the way a search engine does. They’re simply predicting what words or sentences will come next based on the material they’ve been trained on. They don’t store this training material, though.
They’re using word vectors to determine what words are most likely to come next. That’s why they can be so good and also hallucinate.
AI can’t crawl the internet. It has no knowledge of current events and can’t cite sources because it doesn’t know or retain that information. Sure, you can ask it to cite sources, but it’s really just making stuff up.
For really popular topics that were discussed a lot, it can get pretty close – because the probabilities of those words coming next are really high – but the more specific you get, the more it will hallucinate.
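The "next-word probability" idea can be illustrated with a toy word-pair counter. This is a drastic simplification offered only as a sketch: real large language models use neural networks over tokens rather than word-pair counts, and the tiny corpus and function name below are invented for the example.

```python
from collections import Counter, defaultdict

# Invented miniature "training data" for illustration only.
corpus = (
    "search is immortal . search is changing . "
    "search is immortal . seo is immortal ."
).split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1


def next_word_probabilities(word: str) -> dict[str, float]:
    """Return the observed probability of each word that followed `word`."""
    counts = following.get(word, Counter())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


probs = next_word_probabilities("is")
best = max(probs, key=probs.get)
print(probs, best)

# A word the corpus never contained yields nothing to predict from.
print(next_word_probabilities("tiktok"))
```

Frequent patterns produce confident predictions, while anything outside the training text gives the counter nothing to work with. A real model would still produce some output in that case, which is roughly where hallucination comes from.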
Given the extreme amount of time and resources it takes to train the model, it will be a long time before AI can answer any queries about current events.
But What About Bing, You.com, And Google’s Upcoming Bard? They Can Do All Of This, Can’t They?
Yes and no. They can cite sources, but that’s based on how they’re implementing it. To vastly oversimplify, Bing isn’t simply passing your question to a pure chatbot.
Bing is searching for your query/keyword. It’s then feeding in all the webpages that it would normally return for that search and asking the AI to summarize those webpages.
You and I can’t do that on the public-facing AI tools without hitting token limits, but search engines can!
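The search-then-summarize flow described above can be sketched roughly as follows. Everything here is hypothetical: `search_index`, `build_summary_prompt`, the word count used as a stand-in for tokens, and the budget value are invented for illustration and do not reflect how Bing actually implements it. The sketch stops at building the prompt and makes no model call.

```python
def search_index(query: str, documents: dict[str, str]) -> list[str]:
    """Hypothetical retrieval step: URLs whose text mentions every query word."""
    words = query.lower().split()
    return [
        url
        for url, text in documents.items()
        if all(word in text.lower() for word in words)
    ]


def build_summary_prompt(query, urls, documents, max_tokens=50):
    """Pack retrieved pages into a prompt until a crude token budget is reached."""
    parts = [f"Summarize these pages to answer: {query}"]
    used = len(parts[0].split())
    for url in urls:
        # Labeling each snippet with its URL is what lets an answer cite sources.
        snippet = f"[{url}] {documents[url]}"
        cost = len(snippet.split())  # Assumption: one word is roughly one token.
        if used + cost > max_tokens:
            break
        parts.append(snippet)
        used += cost
    return "\n".join(parts)


documents = {
    "https://example.com/a": "Bing was renamed from Live in 2009.",
    "https://example.com/b": "Vista was an operating system.",
}
query = "bing renamed"
urls = search_index(query, documents)
prompt = build_summary_prompt(query, urls, documents)
print(prompt)
```

The budget check is the part that matters for the point about token limits: pages that do not fit are simply left out of the prompt.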
Ok, Surely This Will Kill SEO. AI Will Just Answer Every Question, Right?
I disagree.
All the way back in 2009 (when we were listening to the Black Eyed Peas on our iPhone 3Gs and updating our MySpace top 8 on Windows Vista), a search engine once called Live was being renamed to Bing.
Why? Because Bing is a verb. This prompted Bill Gates to declare, “The future of search is verbs.”
I love to share this quote with clients every chance I get because that future is now.
Gates wasn’t talking about people typing action words into search engines. He meant that people are trying to “do” something, and the job of search is to help facilitate that.
People often forget that search is a form of pull marketing, where users tell us what they want – not push marketing like a billboard or a TV ad.
As digital marketers, our job is simple: Give users what they want.
This is where the confusion comes in, though.
For many queries that have simple answers, a link to a website with a popup cookie policy, notification alert, newsletter sign-up popup, and ads was never what the user wanted.
It’s just the best thing we had back then. Search engines never set out with the end goal of providing links to websites. They set out to answer questions and help users accomplish tasks.
Even from the earliest days, Google talked about how its goal was to be the Star Trek computer; it just didn’t have the technology to do it then. Now, it does.
For many of these queries, like [how old is Taylor Swift?] or [how many megabytes in a gigabyte?], websites will lose traffic – but it’s traffic they were probably never entitled to.
Who owns that answer anyway? These are questions with simple answers. The user’s task is simply to get a number. They don’t want a website.
Smart SEO pros will focus on the type of queries where a user wants to do something – like buy Taylor Swift tickets, get reviews of her album or concerts, chat with other Swifties, etc. That’s where AI won’t be able to kill SEO or search.
What ChatGPT Can Do Vs. What It Can’t
ChatGPT can accomplish a lot of things.
It’s good at showing me how to write an Excel formula or MySQL query, but it will never teach me MySQL, sell me a course, or let me talk with other developers about database theory.
Those are things a search engine can help me do.
ChatGPT can also help answer many “common knowledge” questions, as long as the topic isn’t contested and is old and popular enough to have shown up in the training data.
Even then, it’s still not 100% accurate – as we’ve seen in countless memes and with one famous bank being called out for its AI-written article not knowing how to calculate interest properly.
AI might list the most talked about bars in NYC, but it can’t recommend the best place to get an Old Fashioned like a human can.
Honestly, all SEO pros talking about using AI to create content are starting to bore me. Answering questions is neat, but where ChatGPT really excels is in text manipulation.
At my agency, we’re already using ChatGPT’s API as an SEO tool to help create content briefs, categorize and cluster keywords, write complicated regular expressions for redirects, and even generate XML or JSON-LD code based on given inputs.
These rely on tons of inputs from various sources and require lots of manual reviews.
We’re not using it to create content, though. We’re using it to summarize and examine other pieces of content and then use those to glean insights. It’s less of an SEO replacement and more of a time saver.
SEO Is Here To Stay
What if your business is built around displaying facts you don’t really “own”? If so, you should probably be worried – not just about AI.
Boilerplate copy tasks may be handled by AI. Recent tests I’ve done on personal sites have shown some success here.
But AI will never be capable of coming up with insights or creating new ideas, staying on top of the latest trends, or providing the experience, expertise, authority, or trust that a real author can.
Remember: It’s not thinking, citing, or even pulling data from a database. It’s just looking at the next-word probabilities.
Unlike thousands of SEO pros who recently updated their Twitter bios, I may not be an expert on AI, but I have a computer science degree. I also know what it takes to understand user needs.
So far, no data shows people would prefer auto-generated, re-worded content over unique curated content written by a real human being.
People want fresh ideas and insights that only people can provide. (If we add an I to E-A-T, where should it go?)
If your business or content delivers value through insights, curation, current trends, recommendations, solving problems, or performing an action, then SEO and search engines aren’t going anywhere.
They may change shape from time to time, but that just means job security for me – and I’m good with that.
Featured Image: Elnur/Shutterstock