You Can’t Compare Backlink Counts in SEO Tools: Here’s Why

Google knows about 300T pages on the web. It’s doubtful they crawl all of those, and according to documents from their antitrust trial, we learned they only indexed 400B. That’s around 0.133% of the pages they know about, roughly 1 out of every 752 pages.
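As a quick sanity check on those proportions (a sketch using the rounded figures above; the 1-in-752 figure presumably comes from less-rounded counts):

```python
# Rough sanity check using the rounded figures from above.
known_pages = 300e12    # ~300T pages Google knows about
indexed_pages = 400e9   # ~400B pages indexed, per the antitrust documents

share = indexed_pages / known_pages
print(f"Indexed share: {share:.3%}")      # ~0.133%
print(f"1 out of every {1 / share:.0f}")  # ~750 with these rounded inputs
```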

For Ahrefs, we choose to store about 340B pages in our index as of December 2023.

At a certain point, the quality of the web drops off. There are lots of spam and junk pages that just add noise to the data without adding any value to the index.

Large parts of the web are also duplicate content, ~60% according to Google’s Gary Illyes. Most of this is technical duplication caused by different systems. If you don’t account for this duplication, though, it wastes resources and adds even more noise to the data.

When building an index of the web, companies have to make many choices around crawling, parsing, and indexing data. While there’s going to be a lot of overlap between indexes, there’s also going to be some differences depending on each company’s decisions.

Comparing link indexes is hard because of all the different choices the various tools have made. I try my best to make comparisons fairer, but honestly, even for a few sites, I don’t want to put in all the work needed to make an accurate comparison, much less do it for an entire study. You’ll see why when you read what it would take to compare the data accurately.

However, I did run some tests on a sample of sites and I’ll show you how to check the data yourself. I also pulled some fairly large 3rd party data samples for some additional validation.

Let’s dive in.

If you just looked at dashboard numbers for links and referring domains (RDs) in different tools, you might see completely different things.

For example, here’s what we count in Ahrefs:

  • Live links
  • Live RDs
  • 6 months of data

In Semrush, here’s what they count:

  • Live + dead links
  • Live + dead RDs
  • 6 months of data + a bit more*

*By a bit more, what I mean is that their data goes back 6 months and to the start of the previous month. So, for instance, if it’s the 15th of the month, they would actually have about 6.5 months of data instead of 6 months of data. If it’s the last week of the month, they may have close to 7 months of data instead of 6.
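To make that window concrete, here’s a minimal sketch of the date math as I understand it (my reading of the behavior described above, not Semrush’s documented logic):

```python
from datetime import date

def approx_semrush_window_start(today: date) -> date:
    # My reading of the behavior described above (an assumption, not
    # Semrush's documented logic): go back 6 months, then snap to the
    # first day of that month.
    month_index = (today.year * 12 + today.month - 1) - 6
    year, month = divmod(month_index, 12)
    return date(year, month + 1, 1)

for d in (date(2024, 6, 15), date(2024, 6, 28)):
    start = approx_semrush_window_start(d)
    months = (d - start).days / 30.44
    print(f"{d}: window starts {start} (~{months:.1f} months of data)")
```

Run on the 15th of a month, this yields ~6.5 months of data; run in the last week, close to 7, matching the examples above.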

This may not seem like much, but it can inflate the numbers shown considerably, especially when dead links and dead RDs are still being counted.

I don’t think SEOs want to see a number that includes dead links. I don’t see a good reason to count them, either, other than to have bigger and potentially misleading numbers.

I only mention this because I’ve called Semrush out on making this type of biased comparison on Twitter before, but I stopped arguing when I realized that they didn’t really want the comparison to be fair; they just wanted to win it.

There are some ways you can compare the data to get somewhat similar time periods and only look at active links.

If you filter the Semrush backlinks report for “Active” links, you’ll have a somewhat more accurate number to compare against the Ahrefs dashboard number.

Alternatively, if you use the “Show history: Last 6 months” option in the Ahrefs backlink report, this would include lost links and be a fairer comparison to Semrush’s dashboard number.

Here’s an example of how to get more similar data:

  • Semrush Dashboard: 5.1K = Ahrefs (6-month date comparison): 5.6K
  • Semrush All Links: 5.1K = Ahrefs (6-month date comparison): 5.6K
  • Semrush Active Links: 2.9K = Ahrefs Dashboard: 3.5K = Ahrefs (no date comparison): 3.5K

What you should not compare is Semrush Dashboard and Ahrefs Dashboard numbers. The number in Semrush (5.1K) includes dead links. The number in Ahrefs (3.5K) doesn’t; it’s only live links!

Note that the time periods still may not be exactly the same, as mentioned before, because of the extra days in the Semrush data. You could look at what day their data stops and select that exact day in the Ahrefs data to get a closer, though still imperfect, comparison.

I don’t think the comparison works at all with larger domains because of an issue in Semrush. Here’s what I saw for semrush.com:

  • Semrush Dashboard: 48.7M = Ahrefs (6-month date comparison): 24.7M
  • Semrush All Links: 48.7M = Ahrefs (6-month date comparison): 24.7M
  • Semrush Active Links: 1.8M = Ahrefs Dashboard: 15.9M = Ahrefs (no date comparison): 15.9M

So that’s 1.8M active links in Semrush vs 15.9M active in Ahrefs. But as I said, I don’t think this is a fair comparison. Semrush seems to have an issue with larger sites. There is a warning in Semrush that says, “Due to the size of the analyzed domain, only the most relevant links will be shown.” It’s possible they’re not showing all the links, but this is suspicious because they’ll still show the total for all links, which is a larger number, and I can filter those links in other ways.

I can also sort normally by the oldest last seen date and see all the links, but when I do last seen + active, I see only 608K links. I can’t get more than 50k rows in their system to investigate this further, but something is fishy here.

More link differences

The adjustments above still wouldn’t be enough to make an accurate comparison. There are a number of remaining differences and problems that make any sort of comparison troublesome.

This tweet is as relevant as the day I wrote it:

It’s almost impossible to do a fair link comparison

Here’s how we count links, but it’s worth mentioning that each tool counts links in different ways.

To recap some of the main points, here are some things we do:

  • We store some links inserted with JavaScript; no one else does this. We render ~250M pages a day.
  • We have a canonicalization system in place that others may not, which means we shouldn’t count as many duplicates as others do.
  • Our crawler tries to be intelligent about what it prioritizes for crawling, to avoid spam and things like infinite crawl paths.
  • We count one link per page; others may count multiple links per page.

These differences make a fair link comparison nearly impossible to do.

How to see where the biggest link differences are

The easiest way to see the biggest discrepancies in link totals is to go to the Referring Domains reports in the tools and sort by the number of links. You can use the dropdowns to see what kinds of issues each index may have with overcounting some links. In many cases, you’re likely to see millions of links from the same site for some of the reasons mentioned above.

For example, when I looked in Semrush, I found blogspot links that they claimed to have recently checked but that return a 404 when I visit them. Semrush still counts them for some reason. I saw this issue on multiple domains I checked.

Lots of links counted as live are actually dead

Seeing dead links like those counted in the totals made me want to check how many dead links were in each index. I ran crawls on the lists of the most recent live links in each tool to see how many were actually still live.

For Semrush, 49.6% of the links they said were live were actually dead. Some churn is expected as the web changes, but half the links dying within 6 months suggests that a lot of these links may be on a spammier, less stable part of the web, or that the links aren’t being re-crawled often. For some context, the same number for Ahrefs came back as 17.2% dead.
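If you want to run the same kind of check yourself, here’s a minimal sketch (assuming you’ve exported a tool’s list of “live” link sources to a text file; the filename is hypothetical, and treating any 4xx/5xx or connection failure as dead is a simplification):

```python
import requests

def check_urls(urls, timeout=10):
    # Count a URL as dead on a 4xx/5xx status or a connection failure.
    # A stricter check would also confirm the backlink itself is still
    # present in the linking page's HTML.
    live, dead = [], []
    headers = {"User-Agent": "Mozilla/5.0 (link-check script)"}
    for url in urls:
        try:
            r = requests.get(url, headers=headers, timeout=timeout)
            (live if r.status_code < 400 else dead).append(url)
        except requests.RequestException:
            dead.append(url)
    return live, dead

with open("exported_live_links.txt") as f:  # hypothetical export file
    urls = [line.strip() for line in f if line.strip()]

live, dead = check_urls(urls)
if urls:
    print(f"{len(dead)}/{len(urls)} dead ({len(dead) / len(urls):.1%})")
```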

It’s going to get more complicated to compare these numbers

Ahrefs recently added a filter for “Best links” which you can configure to filter out noise. For instance, if you want to remove all blogspot.com blogs from the report, you can add a filter for it.

Ahrefs’ Best links filter

This means you’ll only see links you consider important in the reports. This can also be applied to the main dashboard numbers and charts now. If the filter is active, people will see different numbers depending on their settings.

You would think this is straightforward, but it’s not.

Solving for all the issues is a lot of work

There are a lot of different things you’d have to solve for here:

  • The extra days in Semrush’s data that you’ll have to remove or add to the Ahrefs number.
  • Remember that Semrush also includes dead RDs in their dashboard numbers. So you need to filter their RD report to just “Active” to get the live ones.
  • Remember that half the links in the test of Semrush live data were actually dead, so I would suspect that a number of the RDs are actually lost as well. You could possibly look for domains with low link counts and just crawl the listed links from those to remove most of the dead ones.
  • After all that, you’re still going to need to strip the domains down to the root domain only, to account for the differences in what each tool may be counting as a domain (see the sketch after this list).
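For the root-domain stripping in that last point, something like this works (a sketch; tldextract uses the Public Suffix List, so subdomains and country-code suffixes are handled correctly):

```python
# pip install tldextract
import tldextract

def root_domain(host_or_url: str) -> str:
    # Reduce any host or URL to its registered (root) domain.
    return tldextract.extract(host_or_url).registered_domain

rds = ["blog.example.com", "shop.example.com", "foo.bar.co.uk", "bar.co.uk"]
print(sorted({root_domain(d) for d in rds}))  # ['bar.co.uk', 'example.com']
```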

What is a domain?

Ahrefs currently shows 206.3M RDs in our database and Semrush shows 1.6B. Domains are being counted in extremely different ways between the tools.

Ahrefs has 340B pages and 206M domains in the index

According to the major sources who look at these kinds of things, the number of domains on the internet seems to be between 269M and 359M, and the number of websites between 1.1B and 1.5B, with 191M to 200M of them being active.

Semrush’s number of RDs is higher than the number of domains that exist.

I believe Semrush may be confusing different terms. Their numbers match fairly closely with the number of websites on the internet, but that’s not the same as the number of domains. Plus, many of those websites aren’t even live.


Part of our process is dropping spam domains, and we also treat some subdomains as different domains. We come up close to the numbers from other 3rd party studies for the number of active websites and domains, whereas Semrush seems to come in closer to the total number of websites (including inactive ones).

We’re going to simplify our methodology soon so that one domain is actually just one domain. This is going to make our RD numbers go down, but be more accurate to what people actually consider a domain. It’s also going to make for an even bigger disparity in the numbers between the tools.

I ran some quality checks on both the first-seen and last-seen link data. On every site I checked, Ahrefs picked up more links first, and on most sites, Ahrefs updated the links more recently than Semrush. Don’t just believe me, though; check for yourself.

Comparing this is biased no matter how you look at it, because our data is more granular: it includes the hours and minutes instead of just the day. Keeping the hours and minutes biases the comparison one way, and stripping them biases it the other. You’ll have to match the URLs between tools, check which date is earlier (or whether it’s a tie), and then count the totals. There will be some different links in each dataset, so you’ll need to do the lookups on each set of data for comparison.
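Here’s a minimal sketch of that matching, with both sides truncated to the day (the export dicts are hypothetical stand-ins for whatever you pull from each tool):

```python
from datetime import datetime

def to_day(ts: str) -> str:
    # Truncate to the day so a tool that reports hours and minutes
    # doesn't automatically "win" on granularity alone.
    return datetime.fromisoformat(ts).date().isoformat()

# Hypothetical exports: {linking_url: first_seen_timestamp}
tool_a = {"https://site.com/a": "2024-05-01T09:15:00",
          "https://site.com/b": "2024-05-03T00:00:00"}
tool_b = {"https://site.com/a": "2024-05-02T00:00:00",
          "https://site.com/b": "2024-05-03T00:00:00"}

wins_a = wins_b = ties = 0
for url in tool_a.keys() & tool_b.keys():  # only links both tools found
    a, b = to_day(tool_a[url]), to_day(tool_b[url])
    if a < b:
        wins_a += 1
    elif b < a:
        wins_b += 1
    else:
        ties += 1
print(f"Tool A first: {wins_a}, Tool B first: {wins_b}, ties: {ties}")
```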

Semrush claims, “We update the backlinks data in the interface every 15 minutes.”

Ahrefs claims, “The world’s largest index of live backlinks, updated with fresh data every 15–30 minutes.”

I pulled data at the same time from both tools to see when the latest links for some popular websites were found. Here’s a summary table:

Domain         Ahrefs latest    Semrush latest
semrush.com    3 minutes ago    7 days ago
ahrefs.com     2 minutes ago    5 days ago
hubspot.com    0 minutes ago    9 days ago
foxnews.com    1 minute ago     12 days ago
cnn.com        0 minutes ago    13 days ago
amazon.com     0 minutes ago    6 days ago

That doesn’t seem fresh at all. Their 15-minute update claim seems pretty dubious to me with so many websites not having updates for many days.

In fairness, for some smaller sites it was more mixed on who showed fresher data. I think they may have some issues with the processing of larger sites.

Don’t just trust me, though; I encourage you to check some websites yourself. Go into the backlinks reports in both tools and sort by last seen. Be sure to share your results on social media.

Ahrefs crawls 7B+ pages every day. Semrush claims they crawl 25B pages per day. This would be ~3.5x what Ahrefs crawls per day. The problem is that I can’t find any evidence that they crawl that fast.

We saw that around half the links that Semrush had marked as active were actually dead compared to about 17% in Ahrefs, which indicated to me that they may not re-crawl links as often. That and the freshness test both pointed to them crawling slower. I decided to look into it.

Logs of my sites

I checked the logs of some of my sites and sites I have access to, and I didn’t see anything to support the claim that Semrush crawls faster. If you have access to logs of your own site, you should be able to check which bots are crawling the fastest.
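If you have raw server logs, here’s a minimal sketch of that check (assuming the NCSA combined log format, where the user agent is the last quoted field; the log path is hypothetical):

```python
import re
from collections import Counter

BOTS = ["AhrefsBot", "SemrushBot", "Googlebot", "bingbot"]
ua_pattern = re.compile(r'"([^"]*)"\s*$')  # last quoted field = user agent

hits = Counter()
with open("access.log") as f:  # hypothetical log path
    for line in f:
        m = ua_pattern.search(line)
        if not m:
            continue
        ua = m.group(1).lower()
        for bot in BOTS:
            if bot.lower() in ua:
                hits[bot] += 1
                break

for bot, count in hits.most_common():
    print(f"{bot}: {count} requests")
```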

80,000 months of log data

I was curious and wanted to look at bigger samples. I used Web Explorer and a few different footprints (patterns) to find log file summaries produced by AWStats and Webalizer. These are often published on the web.

Web Explorer search I used to find log files on the web

I scraped and parsed ~80,000 log file summaries that contained 1 month of data each and were generated in the last couple of years. This sample contained over 9k websites in total.

I did not see evidence of Semrush crawling many times faster than Ahrefs for these sites, as they claim they do. The only bot that was crawling much faster than Ahrefsbot in this dataset was Googlebot. Even other search engines were behind our crawl rate.

That’s just data from a small-ish number of sites compared to the scale of the web. What about for a larger chunk of the web?

Data from 20%+ of web traffic

At the time of writing, Cloudflare Radar has Ahrefsbot as the #7 most active bot on the web and Semrushbot at #40.

While this isn’t a complete picture of the web, it’s a fairly large chunk. In 2021, Cloudflare was said to manage ~20% of the web’s traffic, up from ~10% in 2018. It’s likely much higher now with that kind of growth. I couldn’t find the numbers from 2021, but in early 2022 they were handling 32 million HTTP requests / second on average and in early 2023 they had already grown to handling 45 million HTTP requests / second on average, over 40% more in one year!

Additionally, ~80% of websites that use a CDN use Cloudflare. They handle many of the larger sites on the web; BuiltWith shows that Cloudflare is used by ~32% of the Top 1M websites. That’s a significant sample size and likely the largest sample that exists.

How much do SEO tools crawl?

Some of the SEO tools share the number of pages they crawl on their websites. The only one in the table below that doesn’t have a publicly published crawl rate is the AhrefsSiteAudit bot, but I asked our team to pull the info for it. Let me put the rankings in perspective with actual and claimed crawl rates.

Ranking   Bot               Crawl rate
7         Ahrefsbot         7B+ / day
27        DataForSEO Bot    2B / day
29        AhrefsSiteAudit   600M – 700M / day
35        Botify            143.3M / day
40        Semrushbot        25B / day (claimed)

The math isn’t mathing. How can Semrush claim to crawl multiple times faster than these others while ranking lower? Cloudflare doesn’t cover the entire web, but it covers a large chunk of it, and it’s a more than representative sample.

When they originally made this 25B claim, I believe they were closer to 90th on Cloudflare Radar, near the bottom of the list at the time. Semrush hasn’t updated this number since then, and I recall a period of time when they were in the 60s-70s on Cloudflare Radar as well. They do seem to be getting faster, but their claimed numbers still don’t add up.

I don’t hear SEOs raving about Moz or Sistrix having the best link data, but they are 21st and 36th on the list respectively. Both are higher than Semrush.

Possible explanations of differences

Semrush may be conflating the terms “pages” and “links,” which is actually mentioned in some of their documentation. I don’t want to link to it, but you can find it with this quote: “Daily, our bot crawls over 25 billion links”. But links are not the same thing as pages; there can be hundreds of links on a single page.
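That distinction matters because one crawled page can yield many links. You can see the multiplier on any page with a few lines (a sketch; it counts raw anchor tags, so it overstates unique links):

```python
import requests
from bs4 import BeautifulSoup

url = "https://example.com/"  # any page you want to spot-check
html = requests.get(url, timeout=10).text
soup = BeautifulSoup(html, "html.parser")
links = [a["href"] for a in soup.find_all("a", href=True)]
print(f"1 page crawled -> {len(links)} links found")
```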

It’s also possible they’re crawling a portion of the web that’s just more spammy and isn’t reflected in the data from either of the sources I looked at. Some of the numbers indicate this may be the case.

Y’all shouldn’t trust studies done by a specific vendor when they compare themselves to others, even this one. I try to be as fair as I can be and follow the data, but since I work at Ahrefs, you can hardly consider me unbiased. Go look at the data yourselves and run your own tests.

There are some folks in the SEO community who try to do these tests every once in a while. The last major 3rd party study was run by Matthew Woodward, who initially declared Semrush the winner, but the conclusion was changed and Ahrefs was ultimately declared to be the rightful winner. What happened?

The methodology chosen for the study heavily favored Semrush and was investigated by a friend of mine, Russ Jones, may he rest in peace. Here’s what Russ had to say about it:

While services like Majestic and Ahrefs likely store a single canonical IP address per domain, SEMRush seems to store per link, which accounts for why there would be more IPs than referring domains in some cases. I do not think SEMRush is intentionally inflating their numbers, I think they are storing the data in a different way than competitors which results in a number that is higher and potentially misleading, but not due to ill intent.

The response from Matthew indicated that Semrush might have misled him in their favor. Here’s that comment:

Comment from Matthew Woodward in response to Semrush about the test.

In the end, Ahrefs won.

Check our current stats on our big data page.

Hardware listed on the Ahrefs big data page

While Semrush doesn’t provide current hardware stats, they did provide some in the past when they made changes to their link index.

In June 2019, they made an announcement that claimed they had the biggest index. The test from Matthew Woodward that I talked about happened after this announcement, and as you saw, Ahrefs won.

In June 2021, they made another announcement about their link index that claimed they were the biggest, fastest, and best.

These are some stats they released at the time:

  • 500 servers
  • 16,128 CPU cores
  • 245 TB of memory
  • 13.9 PB of storage
  • 25B+ pages / day
  • 43.8T links

The release said they increased storage, but their previous release said they had 4000 PBs of storage. They said the storage was 4x, and since 4000 TB (4 PB) grown ~4x comes to roughly the 13.9 PB they now claim, I guess the previous number was supposed to be 4000 TBs and not 4000 PBs, and they just got mixed up on the terminology.

I checked our numbers at the time, and this is how we matched up:

  • 2400 servers (~5x greater)
  • 200,000 CPU cores (~12.5x greater)
  • 900 TB of memory (~4x greater)
  • 120 PB of storage (~9x greater)
  • 7B pages / day (~3.5x less???)
  • 2.8T live links (I’m not sure of the total size, but to this day it’s not as big as the number they claimed)

They were claiming more links and faster crawling with much less storage and hardware. Granted, we don’t know the details of the hardware, but we don’t run on dated tech.

They claimed to store more links than we have even now and in less space than we add to our system each month. It really doesn’t make sense.

Final thoughts

Don’t blindly trust the numbers on the dashboards or the general numbers because they may represent completely different things. While there’s no perfect way to compare the data between different tools, you can run many of the checks I showed to try to compare similar things and clean up the data. If something looks off, ask the tool vendors for an explanation.

If there ever comes a time when we stop winning on things like tech and crawl speed, go ahead and switch to another tool and stop paying us. But until that time, I’d be highly skeptical of any claims by other tools.

If you have questions, message me on X.





Google Search Leak: Conflicting Signals, Unanswered Questions


An apparent leak of Google Search API documentation has sparked intense debate within the SEO community, with some claiming it proves Google’s dishonesty and others urging caution in interpreting the information.

As the industry grapples with the allegations, a balanced examination of Google’s statements and the perspectives of SEO experts is crucial to understanding the whole picture.

Leaked Documents Vs. Google’s Public Statements

Over the years, Google has consistently maintained that specific ranking signals, such as click data and user engagement metrics, aren’t used directly in its search algorithms.

In public statements and interviews, Google representatives have emphasized the importance of relevance, quality, and user experience while denying the use of specific metrics like click-through rates or bounce rates as ranking-related factors.

However, the leaked API documentation appears to contradict these statements.

It contains references to features like “goodClicks,” “badClicks,” “lastLongestClicks,” impressions, and unicorn clicks, tied to systems called Navboost and Glue, which Google VP Pandu Nayak confirmed in DOJ testimony are parts of Google’s ranking systems.

The documentation also alleges that Google calculates several metrics using Chrome browser data on individual pages and entire domains, suggesting the full clickstream of Chrome users is being leveraged to influence search rankings.

This contradicts past Google statements that Chrome data isn’t used for organic searches.

The Leak’s Origins & Authenticity

Erfan Azimi, CEO of digital marketing agency EA Eagle Digital, alleges he obtained the documents and shared them with Rand Fishkin and Mike King.

Azimi claims to have spoken with ex-Google Search employees who confirmed the authenticity of the information but declined to go on record due to the situation’s sensitivity.

While the leak’s origins remain somewhat ambiguous, several ex-Googlers who reviewed the documents have stated they appear legitimate.

Fishkin states:

“A critical next step in the process was verifying the authenticity of the API Content Warehouse documents. So, I reached out to some ex-Googler friends, shared the leaked docs, and asked for their thoughts.”

Three ex-Googlers responded, with one stating, “It has all the hallmarks of an internal Google API.”

However, without direct confirmation from Google, the authenticity of the leaked information is still debatable. Google has not yet publicly commented on the leak.

It’s important to note that, according to Fishkin’s article, none of the ex-Googlers confirmed that the leaked data was from Google Search, only that it appears to have originated from within Google.

Industry Perspectives & Analysis

Many in the SEO community have long suspected that Google’s public statements don’t tell the whole story. The leaked API documentation has only fueled these suspicions.

Fishkin and King argue that if the information is accurate, it could have significant implications for SEO strategies and website search optimization.

Key takeaways from their analysis include:

  • Navboost and the use of clicks, CTR, long vs. short clicks, and user data from Chrome appear to be among Google’s most powerful ranking signals.
  • Google employs safelists for sensitive topics like COVID-19, elections, and travel to control what sites appear.
  • Google uses Quality Rater feedback and ratings in its ranking systems, not just as a training set.
  • Click data influences how Google weights links for ranking purposes.
  • Classic ranking factors like PageRank and anchor text are losing influence compared to more user-centric signals.
  • Building a brand and generating search demand is more critical than ever for SEO success.

However, just because something is mentioned in API documentation doesn’t mean it’s being used to rank search results.

Other industry experts urge caution when interpreting the leaked documents.

They point out that Google may use the information for testing purposes or apply it only to specific search verticals rather than use it as active ranking signals.

There are also open questions about how much weight these signals carry compared to other ranking factors. The leak doesn’t provide the full context or algorithm details.

Unanswered Questions & Future Implications

As the SEO community continues to analyze the leaked documents, many questions still need to be answered.

Without official confirmation from Google, the authenticity and context of the information are still a matter of debate.

Key open questions include:

  • How much of this documented data is actively used to rank search results?
  • What is the relative weighting and importance of these signals compared to other ranking factors?
  • How have Google’s systems and use of this data evolved?
  • Will Google change its public messaging and be more transparent about using behavioral data?

As the debate surrounding the leak continues, it’s wise to approach the information with a balanced, objective mindset.

Unquestioningly accepting the leak as gospel truth or completely dismissing it are both shortsighted reactions. The reality likely lies somewhere in between.

Potential Implications For SEO Strategies and Website Optimization

It would be highly inadvisable to act on information shared from this supposed ‘leak’ without confirming whether it’s an actual Google search document.

Further, even if the content originates from search, the information is a year old and could have changed. Any insights derived from the leaked documentation should not be considered actionable now.

With that in mind, while the full implications remain unknown, here’s what we can glean from the leaked information.

1. Emphasis On User Engagement Metrics

If click data and user engagement metrics are direct ranking factors, as the leaked documents suggest, it could place greater emphasis on optimizing for these metrics.

This means crafting compelling titles and meta descriptions to increase click-through rates, ensuring fast page loads and intuitive navigation to reduce bounces, and strategically linking to keep users engaged on your site.

Driving traffic through other channels like social media and email can also help generate positive engagement signals.

However, it’s important to note that optimizing for user engagement shouldn’t come at the expense of creating reader-focused content. Gaming engagement metrics is unlikely to be a sustainable, long-term strategy.

Google has consistently emphasized the importance of quality and relevance in its public statements, and based on the leaked information, this will likely remain a key focus. Engagement optimization should support and enhance quality content, not replace it.

2. Potential Changes To Link-Building Strategies

The leaked documents contain information about how Google treats different types of links and their impact on search rankings.

This includes details about the use of anchor text, the classification of links into different quality tiers based on traffic to the linking page, and the potential for links to be ignored or demoted based on various spam factors.

If this information is accurate, it could influence how SEO professionals approach link building and the types of links they prioritize.

Links that drive real click-throughs may carry more weight than links on rarely visited pages.

The fundamentals of good link building still apply—create link-worthy content, build genuine relationships, and seek natural, editorially placed links that drive qualified referral traffic.

The leaked information doesn’t change this core approach but offers some additional nuance to be aware of.

3. Increased Focus On Brand Building and Driving Search Demand

The leaked documents suggest that Google uses brand-related signals and offline popularity as ranking factors. This could include metrics like brand mentions, searches for the brand name, and overall brand authority.

As a result, SEO strategies may emphasize building brand awareness and authority through both online and offline channels.

Tactics could include:

  • Securing brand mentions and links from authoritative media sources.
  • Investing in traditional PR, advertising, and sponsorships to increase brand awareness.
  • Encouraging branded searches through other marketing channels.
  • Optimizing for higher search volumes for your brand vs. unbranded keywords.
  • Building engaged social media communities around your brand.
  • Establishing thought leadership through original research, data, and industry contributions.

The idea is to make your brand synonymous with your niche and build an audience that seeks you out directly. The more people search for and engage with your brand, the stronger those brand signals may become in Google’s systems.

4. Adaptation To Vertical-Specific Ranking Factors

Some leaked information suggests that Google may use different ranking factors or algorithms for specific search verticals, such as news, local search, travel, or e-commerce.

If this is the case, SEO strategies may need to adapt to each vertical’s unique ranking signals and user intents.

For example, local search optimization may focus more heavily on factors like Google My Business listings, local reviews, and location-specific content.

Travel SEO could emphasize collecting reviews, optimizing images, and directly providing booking/pricing information on your site.

News SEO requires focusing on timely, newsworthy content and optimized article structure.

While the core principles of search optimization still apply, understanding your particular vertical’s nuances, based on the leaked information and real-world testing, can give you a competitive advantage.

The leaks suggest a vertical-specific approach to SEO could give you an advantage.

Conclusion

The Google API documentation leak has created a vigorous discussion about Google’s ranking systems.

As the SEO community continues to analyze and debate the leaked information, it’s important to remember a few key things:

  1. The information isn’t fully verified and lacks context. Drawing definitive conclusions at this stage is premature.
  2. Google’s ranking algorithms are complex and constantly evolving. Even if entirely accurate, this leak only represents a snapshot in time.
  3. The fundamentals of good SEO – creating high-quality, relevant, user-centric content and promoting it effectively – still apply regardless of the specific ranking factors at play.
  4. Real-world testing and results should always precede theorizing based on incomplete information.

What To Do Next

As an SEO professional, the best course of action is to stay informed about the leak.

Because details about the document remain unknown, it’s not a good idea to consider any takeaways actionable.

Most importantly, remember that chasing algorithms is a losing battle.

The only winning strategy in SEO is to make your website the best result for your message and audience. That’s Google’s endgame, and that’s where your focus should be, regardless of what any particular leaked document suggests.





Google’s AI Overviews Shake Up Ecommerce Search Visibility


An analysis of 25,000 ecommerce queries by Bartosz Góralewicz, founder of Onely, reveals the impact of Google’s AI overviews on search visibility for online retailers.

The study found that 16% of eCommerce queries now return an AI overview in search results, accounting for 13% of total search volume in this sector.

Notably, 80% of the sources listed in these AI overviews do not rank organically for the original query.

“Ranking #1-3 gives you only an 8% chance of being a source in AI overviews,” Góralewicz stated.

Shift Toward “Accelerated” Product Experiences

International SEO consultant Aleyda Solis analyzed the disconnect between traditional organic ranking and inclusion in AI overviews.

According to Solis, for product-related queries, Google is prioritizing an “accelerated” approach over summarizing currently ranking pages.

She commented on Góralewicz’s findings, stating:

“… rather than providing high level summaries of what’s already ranked organically below, what Google does with e-commerce is “accelerate” the experience by already showcasing what the user would get next.”

Solis explains that for queries where Google previously ranked category pages, reviews, and buying guides, it’s now bypassing this level of results with AI overviews.

Assessing AI Overview Traffic Impact

To help retailers evaluate their exposure, Solis has shared a spreadsheet that analyzes the potential traffic impact of AI overviews.

As Góralewicz notes, this could be an initial rollout, speculating that “Google will expand AI overviews for high-cost queries when enabling ads” based on data showing they are currently excluded for high cost-per-click keywords.

An in-depth report across ecommerce and publishing is expected soon from Góralewicz and Onely, with additional insights into this search trend.

Why SEJ Cares

AI overviews represent a shift in how search visibility is achieved for ecommerce websites.

With most overviews currently pulling product data from non-ranking sources, the traditional connection between organic rankings and search traffic is being disrupted.

Retailers may need to adapt their SEO strategies for this new search environment.

How This Can Benefit You

While unsettling for established brands, AI overviews create new opportunities for retailers to gain visibility without competing for the most commercially valuable keywords.

Ecommerce sites can potentially circumvent traditional ranking barriers by optimizing product data and detail pages for Google’s “accelerated” product displays.

The detailed assessment framework provided by Solis enables merchants to audit their exposure and prioritize optimization needs accordingly.


FAQ

What are the key findings from the analysis of AI overviews & ecommerce queries?

Góralewicz’s analysis of 25,000 ecommerce queries found:

  • 16% of ecommerce queries now return an AI overview in the search results.
  • 80% of the sources listed in these AI overviews do not rank organically for the original query.
  • Ranking positions #1-3 only provides an 8% chance of being a source in AI overviews.

These insights reveal significant shifts in how ecommerce sites need to approach search visibility.

Why are AI overviews pulling product data from non-ranking sources, and what does this mean for retailers?

Google’s AI overviews prioritize “accelerated” experiences over summarizing currently ranked pages for product-related queries.

This shift focuses on showcasing directly what users seek instead of traditional organic results.

For retailers, this means:

  • A need to optimize product pages beyond traditional SEO practices, catering to the data requirements of AI overviews.
  • Opportunities to gain visibility without necessarily holding top organic rankings.
  • Potential to bypass traditional ranking barriers by focusing on enhanced product data integration.

Retailers must adapt quickly to remain competitive in this evolving search environment.

What practical steps can retailers take to evaluate and improve their search visibility in light of AI overview disruptions?

Retailers can take several practical steps to evaluate and improve their search visibility:

  • Utilize the spreadsheet provided by Aleyda Solis to assess the potential traffic impact of AI overviews.
  • Optimize product and detail pages to align with the data and presentation style preferred by AI overviews.
  • Continuously monitor changes and updates to AI overviews, adapting strategies based on new data and trends.

These steps can help retailers navigate the impact of AI overviews and maintain or improve their search visibility.


Featured Image: Marco Lazzarini/Shutterstock





Google’s AI Overviews Go Viral, Draw Mainstream Media Scrutiny


Google’s rollout of AI-generated overviews in US search results is taking a disastrous turn, with mainstream media outlets like The New York Times, BBC, and CNBC reporting on numerous inaccuracies and bizarre responses.

On social media, users are sharing endless examples of the feature’s nonsensical and sometimes dangerous output.

From recommending non-toxic glue on pizza to suggesting that eating rocks provides nutritional benefits, the blunders would be amusing if they weren’t so alarming.

Mainstream Media Coverage

As reported by The New York Times, Google’s AI overviews struggle with basic facts, claiming that Barack Obama was the first Muslim president of the United States and stating that Andrew Jackson graduated from college in 2005.

These errors undermine trust in Google’s search engine, which more than two billion people rely on for authoritative information worldwide.

Manual Removal & System Refinements

As reported by The Verge, Google is now scrambling to remove the bizarre AI-generated responses and improve its systems manually.

A Google spokesperson confirmed that the company is taking “swift action” to remove problematic responses and using the examples to refine its AI overview feature.

Google’s Rush To AI Integration

The flawed rollout of AI overviews isn’t an isolated incident for Google.

As CNBC notes in its report, Google made several missteps in a rush to integrate AI into its products.

In February, Google was forced to pause its Gemini chatbot after it generated inaccurate images of historical figures and refused to depict white people in most instances.

Before that, the company’s Bard chatbot faced ridicule for sharing incorrect information about outer space, leading to a $100 billion drop in Google’s market value.

Despite these setbacks, industry experts cited by The New York Times suggest that Google has little choice but to continue advancing AI integration to remain competitive.

However, the challenges of taming large language models, which ingest false information and satirical posts, are now more apparent.

The Debate Over AI In Search

The controversy surrounding AI overviews adds fuel to the debate over the risks and limitations of AI.

While the technology holds potential, these missteps remind everyone that more testing is needed before unleashing it on the public.

The BBC notes that Google’s rivals face similar backlash over their attempts to cram more AI tools into their consumer-facing products.

The UK’s data watchdog is investigating Microsoft after it announced a feature that would take continuous screenshots of users’ online activity.

At the same time, actress Scarlett Johansson criticized OpenAI for using a voice likened to her own without permission.

What This Means For Websites & SEO Professionals

Mainstream media coverage of Google’s erroneous AI overviews brings the issue of declining search quality to public attention.

As the company works to address inaccuracies, the incident serves as a cautionary tale for the entire industry.

Important takeaway: Prioritize responsible use of AI technology to ensure the benefits outweigh its risks.


