Connect with us


How To Automate Ecommerce Category Page Creation With Python



Clustering product inventory and automatically aligning SKUs to search demand is a great way to find opportunities to create new ecommerce categories.

Niche category pages are a proven way for ecommerce sites to align with organic search demand while simultaneously assisting users in purchasing.

If a site stocks a range of products and there is search demand, creating a dedicated landing page is an easy way to align with the demand.

But how can SEO professionals find this opportunity?

Sure, you can eyeball it, but you’ll usually leave a lot of opportunity on the table.

This problem motivated me to script something in Python, which I’m sharing today in a simple to use Streamlit application. (No coding experience required!)

The app linked above created the following output automatically using nothing more than two crawl exports!

Screenshot from Microsoft Excel, May 2022A csv file export showing new subcategories generated automatically using Python

Notice how the suggested categories are automatically tied back to the existing parent category?

A csv export showing that the new subcategories have been tied back to their parent category.Screenshot from Microsoft Excel, May 2022A csv export showing that the new subcategories have been tied back to their parent category.

The app even shows how many products are available to populate the category.

the number of products available to populate the new subcategories have been highlighted.Screenshot from Microsoft Excel, May 2022the number of products available to populate the new subcategories have been highlighted.

Benefits And Uses

  • Improve relevancy to high-demand, competitive queries by creating new landing pages.
  • Increase the chance of relevant site links displaying underneath the parent category.
  • Reduce CPCs to the landing page through increased relevancy.
  • Potential to inform merchandising decisions. (If there is high search demand vs. low product count – there is a potential to widen the range.0
    A mock up image displaying the new categories as sitelinks within the Google search engine.Mock-up Screenshot from Google Chrome, May 2022A mock up image displaying the new categories as sitelinks within the Google search engine.

Creating the suggested subcategories for the parent sofa category would align the site to an additional 3,500 searches per month with relatively little effort.


  • Create subcategory suggestions automatically.
  • Tie subcategories back to the parent category (cuts out a lot of guesswork!).
  • Match to a minimum of X products before recommending a category.
  • Check similarity to an existing category (X % fuzzy match) before recommending a new category.
  • Set minimum search volume/CPC cut-off for category suggestions.
  • Supports search volume and CPC data from multiple countries.

Getting Started/Prepping The Files

To use this app you need two things.

At a high level, the goal is to crawl the target website with two custom extractions.

The internal_html.csv report is exported, along with an inlinks.csv export.

These exports are then uploaded to the Streamlit app, where the opportunities are processed.

Crawl And Extraction Setup

When crawling the site, you’ll need to set two extractions in Screaming Frog – one to uniquely identify product pages and another to uniquely identify category pages.

The Streamlit app understands the difference between the two types of pages when making recommendations for new pages.

The trick is to find a unique element for each page type.

(For a product page, this is usually the price or the returns policy, and for a category page, it’s usually a filter sort element.)

Extracting The Unique Page Elements

Screaming Frog allows for custom extractions of content or code from a web page when crawled.

This section may be daunting if you are unfamiliar with custom extractions, but it’s essential for getting the correct data into the Streamlit app.

The goal is to end up with something looking like the below image.

(A unique extraction for product and category pages with no overlap.)

A screenshot from screaming frog showing two custom extractions to unique identify product and category pagesScreenshot from Screaming Frog SEO Spider, May 2022A screenshot from screaming frog showing two custom extractions to unique identify product and category pages

The steps below walk you through manually extracting the price element for a product page.

Then, repeat for a category page afterward.

If you’re stuck or would like to read more about the web scraper tool in Screaming Frog, the official documentation is worth your time.

Manually Extracting Page Elements

Let’s start by extracting a unique element only found on a product page (usually the price).

Highlight the price element on the page with the mouse, right-click and choose Inspect.

A screenshot demonstrating how to use the inspect element feature of Google Chrome to extract a CSS Selector.Screenshot from Google Chrome, May 2022A screenshot demonstrating how to use the inspect element feature of Google Chrome to extract a CSS Selector.

This will open up the elements window with the correct HTML line already selected.

Right-click the pre-selected line and choose Copy > Copy selector. That’s it!

A screenshot showing how to cop the CSS selector for use in Screaming FrogScreenshot from Google Chrome, May 2022A screenshot showing how to cop the CSS selector for use in Screaming Frog

Open Screaming Frog and paste the copied selector into the custom extraction section. (Configuration > Custom > Extraction).

A screenshot from Screaming Frog showing how to use a custom extractorScreenshot from Screaming Frog SEO Spider, May 2022A screenshot from Screaming Frog showing how to use a custom extractor

Name the extractor as “product,” select the CSSPath drop down and choose Extract Text.

Repeat the process to extract a unique element from a category page. It should look like this once completed for both product and category pages.

A screenshot from Screaming Frog showing the custom extractor correctly populatedScreenshot from Screaming Frog SEO Spider, May 2022A screenshot from Screaming Frog showing the custom extractor correctly populated

Finally, start the crawl.

The crawl should look like this when viewing the Custom Extraction tab.

A screenshot showing unique extractions for product and category pagesScreenshot from Screaming Frog SEO Spider, May 2022A screenshot showing unique extractions for product and category pages

Notice how the extractions are unique to each page type? Perfect.

The script uses the extractor to identify the page type.

Internally the app will convert the extractor to tags.

(I mention this to stress that the extractors can be anything as long as they uniquely identify both page types.)

A screenshot of how the app / script interprets the custom extractions to tag each pageScreenshot from Microsoft Excel, May 2022A screenshot of how the app / script interprets the custom extractions to tag each page

Exporting The Files

Once the crawl has been completed, the last step is to export two types of CSV files.

  • internal_html.csv.
  • inlinks to product pages.

Go to the Custom Extraction tab in Screaming Frog and highlight all URLs that have an extraction for products.

(You will need to sort the column to group it.)

A screenshot showing how to select the inlinks report from Screaming Frog ready for exportingScreenshot from Screaming Frog SEO Spider, May 2022A screenshot showing how to select the inlinks report from Screaming Frog ready for exporting

Lastly, right-click the product URLs, select Export, and then Inlinks.

A screenshot showing how to right click in Screaming Frog to export the inlinks report.Screenshot from Screaming Frog SEO Spider, May 2022A screenshot showing how to right click in Screaming Frog to export the inlinks report.

You should now have a file called inlinks.csv.

Finally, we just need to export the internal_html.csv file.

Click the Internal tab, select HTML from the dropdown menu below and click on the adjacent Export button.

Finally, choose the option to save the file as a .csv

A screenshot in Screaming Frog showing how to export the internal_html.csv reportScreenshot from Screaming Frog SEO Spider, May 2022A screenshot in Screaming Frog showing how to export the internal_html.csv report

Congratulations! You are now ready to use the Streamlit app!

Using The Streamlit App

Using the Streamlit app is relatively simple.

The various options are set to reasonable defaults, but feel free to adjust the cut-offs to better suit your needs.

I would highly recommend using a Keywords Everywhere API key (although it is not strictly necessary as this can be looked up manually later with an existing tool if preferred.

(The script pre-qualifies opportunity by checking for search volume. If the key is missing, the final output will contain more irrelevant words.)

If you want to use a key, this is the section on the left to pay attention to.

A screenshot showing the area to paste in the option Keywords Everywhere API keyScreenshot from, May 2022A screenshot showing the area to paste in the option Keywords Everywhere API key

Once you have entered the API key and adjusted the cut-offs to your links, upload the inlinks.csv crawl.

A screenshot showing how to upload the inlinks.csv report Screenshot from, May 2022A screenshot showing how to upload the inlinks.csv report

Once complete, a new prompt will appear adjacent to it, prompting you to upload the internal_html.csv crawl file.

A screenshot showing how to upload the internal_html.csv reportScreenshot from, May 2022A screenshot showing how to upload the internal_html.csv report

Finally, a new box will appear asking you to select the product and column names from the uploaded crawl file to be mapped correctly.

A screenshot demonstrating how to correct map the column names from the crawlScreenshot from, May 2022A screenshot demonstrating how to correct map the column names from the crawl

Click submit and the script will run. Once complete, you will see the following screen and can download a handy .csv export.

A screenshot showing the Streamlit app after it has successfully run a reportScreenshot from, May 2022A screenshot showing the Streamlit app after it has successfully run a report

How The Script Works

Before we dive into the script’s output, it will help to explain what’s going on under the hood at a high level.

At a glance:

  • Generate thousands of keywords by generating n-grams from product page H1 headings.
  • Qualify keywords by checking whether the word is in an exact or fuzzy match in a product heading.
  • Further qualify keywords by checking for search volume using the Keywords Everywhere API (optional but recommended).
  • Check whether an existing category already exists using a fuzzy match (can find words out of order, different tenses, etc.).
  • Uses the inlinks report to assign suggestions to a parent category automatically.

N-gram Generation

The script creates hundreds of thousands of n-grams from the product page H1s, most of which are completely nonsensical.

In my example for this article, n-grams generated 48,307 words – so this will need to be filtered!

An example of the script generating thousands of nonsensical n-gram combinations.Screenshot from Microsoft Excel, May 2022An example of the script generating thousands of nonsensical n-gram combinations.

The first step in the filtering process is to check whether the keywords generated via n-grams are found at least X times within the product name column.

(This can be in an exact or fuzzy match.)

Anything not found is immediately discarded, which usually removes around 90% of the generated keywords.

The second filtering stage is to check whether the remaining keywords have search demand.

Any keywords without search demand are then discarded too.

(This is why I recommend using the Keywords Everywhere API when running the script, which results in a more refined output.)

It’s worth noting you can do this manually afterward by searching Semrush/Ahrefs etc., discarding any keywords without search volume, and running a VLOOKUP in Microsoft Excel.

Cheaper if you have an existing subscription.

Recommendations Tied To Specific Landing Pages

Once the keyword list has been filtered the script uses the inlinks report to tie the suggested subcategory back to the landing page.

Earlier versions did not do this, but I realized that leveraging the inlinks.csv report meant it was possible.

It really helps understand the context of the suggestion at a glance during QA.

This is the reason the script requires two exports to work correctly.


  • Not checking search volumes will result in more results for QA. (Even if you don’t use the Keywords Everywhere API, I recommend shortlisting by filtering out 0 search volume afterward.)
  • Some irrelevant keywords will have search volume and appear in the final report, even if keyword volume has been checked.
  • Words will typically appear in the singular sense for the final output (because products are singular and categories are pluralized if they sell more than a single product). It’s easy enough to add an “s” to the end of the suggestion though.

User Configurable Variables

I’ve selected what I consider to be sensible default options.

But here is a run down if you’d like to tweak and experiment.

  • Minimum products to match to (exact match) – The minimum number of products that must exist before suggesting the new category in an exact match.
  • Minimum products to match to (fuzzy match) – The minimum number of products that must exist before suggesting the new category in a fuzzy match, (words can be found in any order).
  • Minimum similarity to an existing category – This checks whether a category already exists in a fuzzy match before making the recommendation. The closer to 100 = stricter matching.
  • Minimum CPC in $ – The minimum dollar amount of the suggested category keyword. (Requires the Keywords Everywhere API.)
  • Minimum search volume – The minimum search volume of the suggested category keyword. (Requires Keywords Everywhere API.)
  • Keywords Everywhere API key – Optional, but recommended. Used to pull in CPC/search volume data. (Useful for shortlisting categories.)
  • Set the country to pull search data from – Country-specific search data is available. (Default is the USA.)
  • Set the currency for CPC data – Country-specific CPC data is available. (Default USD.)
  • Keep the longest word suggestion – With similar word suggestions, this option will keep the longest match.
  • Enable fuzzy product matching – This will search for product names in a fuzzy match. (Words can be found out of order, recommended – but slow and CPU intensive.)


With a small amount of preparation, it is possible to tap into a large amount of organic opportunity while improving the user experience.

Although this script was created with an ecommerce focus, according to feedback, it works well for other site types such as job listing sites.

So even if your site isn’t an ecommerce site, it’s still worth a try.

Python enthusiast?

I released the source code for a non-Streamlit version here.

More resources:

Featured Image: patpitchaya/Shutterstock


if( typeof sopp !== “undefined” && sopp === ‘yes’ ){
fbq(‘dataProcessingOptions’, [‘LDU’], 1, 1000);
fbq(‘dataProcessingOptions’, []);

fbq(‘init’, ‘1321385257908563’);

fbq(‘track’, ‘PageView’);

fbq(‘trackSingle’, ‘1321385257908563’, ‘ViewContent’, {
content_name: ‘python-ecommerce-category-pages’,
content_category: ‘ecommerce technical-seo’

Source link

Keep an eye on what we are doing
Be the first to get latest updates and exclusive content straight to your email inbox.
We promise not to spam you. You can unsubscribe at any time.
Invalid email address


Google’s Search Algorithm Exposed in Document Leak



The Search Algorithm Exposed: Inside Google’s Search API Documents Leak

Google’s search algorithm is, essentially, one of the biggest influencers of what gets found on the internet. It decides who gets to be at the top and enjoy the lion’s share of the traffic, and who gets regulated to the dark corners of the web — a.k.a. the 2nd and so on pages of the search results. 

It’s the most consequential system of our digital world. And how that system works has been largely a mystery for years, but no longer. The Google search document leak, just went public just yesterday, drops thousands of pages of purported ranking algorithm factors onto our laps. 

The Leak

There’s some debate as to whether the documentation was “leaked,” or “discovered.” But what we do know is that the API documentation was (likely accidentally) pushed live on GitHub— where it was then found.

The thousands and thousands of pages in these documents, which appear to come from Google’s internal Content API Warehouse, give us an unprecedented look into how Google search and its ranking algorithms work. 

Fast Facts About the Google Search API Documentation

  • Reported to be the internal documentation for Google Search’s Content Warehouse API.
  • The documentation indicates this information is accurate as of March 2024.
  • 2,596 modules are represented in the API documentation with 14,014 attributes. These are what we might call ranking factors or features, but not all attributes may be considered part of the ranking algorithm. 
  • The documentation did not provide how these ranking factors are weighted. 

And here’s the kicker: several factors found on this document were factors that Google has said, on record, they didn’t track and didn’t include in their algorithms. 

That’s invaluable to the SEO industry, and undoubtedly something that will direct how we do SEO for the foreseeable future.

Is The Document Real? 

Another subject of debate is whether these documents are real. On that point, here’s what we know so far:

  • The documentation was on GitHub and was briefly made public from March to May 2024.
  • The documentation contained links to private GitHub repositories and internal pages — these required specific, Google-credentialed logins to access.
  • The documentation uses similar notation styles, formatting, and process/module/feature names and references seen in public Google API documentation.
  • Ex-Googlers say documentation similar to this exists on almost every Google team, i.e., with explanations and definitions for various API attributes and modules.

No doubt Google will deny this is their work (as of writing they refuse to comment on the leak). But all signs, so far, point to this document being the real deal, though I still caution everyone to take everything you learn from it with a grain of salt.

What We Learnt From The Google Search Document Leak

With over 2,500 technical documents to sift through, the insights we have so far are just the tip of the iceberg. I expect that the community will be analyzing this leak for months (possibly years) to gain more SEO-applicable insights.

Other articles have gotten into the nitty-gritty of it already. But if you’re having a hard time understanding all the technical jargon in those breakdowns, here’s a quick and simple summary of the points of interest identified in the leak so far:

  • Google uses something called “Twiddlers.” These are functions that help rerank a page (think boosting or demotion calculations). 
  • Content can be demoted for reasons such as SERP signals (aka user behavior) indicating dissatisfaction, a link not matching the target site, using exact match domains, product reviews, location, or sexual content.
  • Google uses a variety of measurements related to clicks, including “badClicks”, ”goodClicks”, ”lastLongestClicks” and ”unsquashedClicks”.
  • Google keeps a copy of every version of every page it has ever indexed. However, it only uses the last 20 changes of any given URL when analyzing a page.
  • Google uses a domain authority metric, called “siteAuthority
  • Google uses a system called “NavBoost” that uses click data for evaluating pages.
  • Google has a “sandbox” that websites are segregated to, based on age or lack of trust signals. Indicated by an attribute called “hostAge
  • May be related to the last point, but there is an attribute called “smallPersonalSite” in the documentation. Unclear what this is used for.
  • Google does identify entities on a webpage and can sort, rank, and filter them.
  • So far, the only attributes that can be connected to E-E-A-T are author-related attributes.
  • Google uses Chrome data as part of their page quality scoring, with a module featuring a site-level measure of views from Chrome (“chromeInTotal”)
  • The number, diversity, and source of your backlinks matter a lot, even if PageRank has not been mentioned by Google in years.
  • Title tags being keyword-optimized and matching search queries is important.
  • siteFocusScore” attribute measures how much a site is focused on a given topic. 
  • Publish dates and how frequently a page is updated determines content “freshness” — which is also important. 
  • Font size and text weight for links are things that Google notices. It appears that larger links are more positively received by Google.

Author’s Note: This is not the first time a search engine’s ranking algorithm was leaked. I covered the Yandex hack and how it affects SEO in 2023, and you’ll see plenty of similarities in the ranking factors both search engines use.

Action Points for Your SEO

I did my best to review as much of the “ranking features” that were leaked, as well as the original articles by Rand Fishkin and Mike King. From there, I have some insights I want to share with other SEOs and webmasters out there who want to know how to proceed with their SEO.

Links Matter — Link Value Affected by Several Factors 

Links still matter. Shocking? Not really. It’s something I and other SEOs have been saying, even if link-related guidelines barely show up in Google news and updates nowadays.

Still, we need to emphasize link diversity and relevance in our off-page SEO strategies. 

Some insights from the documentation:

  • PageRank of the referring domain’s homepage (also known as Homepage Trust) affects the value of the link.
  • Indexing tier matters. Regularly updated and accessed content is of the highest tier, and provides more value for your rankings.

If you want your off-page SEO to actually do something for your website, then focus on building links from websites that have authority, and from pages that are either fresh or are otherwise featured in the top tier. 

Some PR might help here — news publications tend to drive the best results because of how well they fulfill these factors.

As for guest posts, there’s no clear indication that these will hurt your site, but I definitely would avoid approaching them as a way to game the system. Instead, be discerning about your outreach and treat it as you would if you were networking for new business partners.

Aim for Successful Clicks 

The fact that clicks are a ranking factor should not be a surprise. Despite what Google’s team says, clicks are the clearest indicator of user behavior and how good a page is at fulfilling their search intent.

Google’s whole deal is providing the answers you want, so why wouldn’t they boost pages that seem to do just that?

The core of your strategy should be creating great user experiences. Great content that provides users with the right answers is how you do that. Aiming for qualified traffic is how you do that. Building a great-looking, functioning website is how you do that.

Go beyond just picking clickbait title tags and meta descriptions, and focus on making sure users get what they need from your website.

Author’s Note: If you haven’t been paying attention to page quality since the concepts of E-E-A-T and the HCU were introduced, now is the time to do so. Here’s my guide to ranking for the HCU to help you get started.

Keep Pages Updated

An interesting click-based measurement is the “last good click.” That being in a module related to indexing signals suggests that content decay can affect your rankings. 

Be vigilant about which pages on your website are not driving the expected amount of clicks for its SERP position. Outdated posts should be audited to ensure content has up-to-date and accurate information to help users in their search journey. 

This should revive those posts and drive clicks, preventing content decay. 

It’s especially important to start on this if you have content pillars on your website that aren’t driving the same traffic as they used to.

Establish Expertise & Authority  

Google does notice the entities on a webpage, which include a bunch of things, but what I want to focus on are those related to your authors.

E-E-A-T as a concept is pretty nebulous — because scoring “expertise” and “authority” of a website and its authors is nebulous. So, a lot of SEOs have been skeptical about it.

However, the presence of an “author” attribute combined with the in-depth mapping of entities in the documentation shows there is some weight to having a well-established author on your website.

So, apply author markups, create an author bio page and archive, and showcase your official profiles on your website to prove your expertise. 

Build Your Domain Authority

After countless Q&As and interviews where statements like “we don’t have anything like domain authority,” and “we don’t have website authority score,” were thrown around, we find there does exist an attribute called “siteAuthority”.

Though we don’t know specifically how this measure is computed, and how it weighs in the overall scoring for your website, we know it does matter to your rankings.

So, what do you need to do to improve site authority? It’s simple — keep following best practices and white-hat SEO, and you should be able to grow your authority within your niche. 

Stick to Your Niche

Speaking of niches — I found the “siteFocusScore” attribute interesting. It appears that building more and more content within a specific topic is considered a positive.

It’s something other SEOs have hypothesized before. After all, the more you write about a topic, the more you must be an authority on that topic, right?

But anyone can write tons of blogs on a given topic nowadays with AI, so how do you stand out (and avoid the risk of sounding artificial and spammy?)

That’s where author entities and link-building come in. I do think that great content should be supplemented by link-building efforts, as a sort of way to show that hey, “I’m an authority with these credentials, and these other people think I’m an authority on the topic as well.”

Key Takeaway

Most of the insights from the Google search document leak are things that SEOs have been working on for months (if not years). However, we now have solid evidence behind a lot of our hunches, providing that our theories are in fact best practices. 

The biggest takeaway I have from this leak: Google relies on user behavior (click data and post-click behavior in particular) to find the best content. Other ranking factors supplement that. Optimize to get users to click on and then stay on your page, and you should see benefits to your rankings.

Could Google remove these ranking factors now that they’ve been leaked? They could, but it’s highly unlikely that they’ll remove vital attributes in the algorithm they’ve spent years building. 

So my advice is to follow these now validated SEO practices and be very critical about any Google statements that follow this leak.

Source link

Keep an eye on what we are doing
Be the first to get latest updates and exclusive content straight to your email inbox.
We promise not to spam you. You can unsubscribe at any time.
Invalid email address
Continue Reading


Google Search Leak: Conflicting Signals, Unanswered Questions




Google Search Leak: Conflicting Signals, Unanswered Questions

An apparent leak of Google Search API documentation has sparked intense debate within the SEO community, with some claiming it proves Google’s dishonesty and others urging caution in interpreting the information.

As the industry grapples with the allegations, a balanced examination of Google’s statements and the perspectives of SEO experts is crucial to understanding the whole picture.

Leaked Documents Vs. Google’s Public Statements

Over the years, Google has consistently maintained that specific ranking signals, such as click data and user engagement metrics, aren’t used directly in its search algorithms.

In public statements and interviews, Google representatives have emphasized the importance of relevance, quality, and user experience while denying the use of specific metrics like click-through rates or bounce rates as ranking-related factors.

However, the leaked API documentation appears to contradict these statements.

It contains references to features like “goodClicks,” “badClicks,” “lastLongestClicks,” impressions, and unicorn clicks, tied to systems called Navboost and Glue, which Google VP Pandu Nayak confirmed in DOJ testimony are parts of Google’s ranking systems.

The documentation also alleges that Google calculates several metrics using Chrome browser data on individual pages and entire domains, suggesting the full clickstream of Chrome users is being leveraged to influence search rankings.

This contradicts past Google statements that Chrome data isn’t used for organic searches.

The Leak’s Origins & Authenticity

Erfan Azimi, CEO of digital marketing agency EA Eagle Digital, alleges he obtained the documents and shared them with Rand Fishkin and Mike King.

Azimi claims to have spoken with ex-Google Search employees who confirmed the authenticity of the information but declined to go on record due to the situation’s sensitivity.

While the leak’s origins remain somewhat ambiguous, several ex-Googlers who reviewed the documents have stated they appear legitimate.

Fishkin states:

“A critical next step in the process was verifying the authenticity of the API Content Warehouse documents. So, I reached out to some ex-Googler friends, shared the leaked docs, and asked for their thoughts.”

Three ex-Googlers responded, with one stating, “It has all the hallmarks of an internal Google API.”

However, without direct confirmation from Google, the authenticity of the leaked information is still debatable. Google has not yet publicly commented on the leak.

It’s important to note that, according to Fishkin’s article, none of the ex-Googlers confirmed that the leaked data was from Google Search. Only that it appears to have originated from within Google.

Industry Perspectives & Analysis

Many in the SEO community have long suspected that Google’s public statements don’t tell the whole story. The leaked API documentation has only fueled these suspicions.

Fishkin and King argue that if the information is accurate, it could have significant implications for SEO strategies and website search optimization.

Key takeaways from their analysis include:

  • Navboost and the use of clicks, CTR, long vs. Short clicks, and user data from Chrome appear to be among Google’s most powerful ranking signals.
  • Google employs safelists for sensitive topics like COVID-19, elections, and travel to control what sites appear.
  • Google uses Quality Rater feedback and ratings in its ranking systems, not just as a training set.
  • Click data influences how Google weights links for ranking purposes.
  • Classic ranking factors like PageRank and anchor text are losing influence compared to more user-centric signals.
  • Building a brand and generating search demand is more critical than ever for SEO success.

However, just because something is mentioned in API documentation doesn’t mean it’s being used to rank search results.

Other industry experts urge caution when interpreting the leaked documents.

They point out that Google may use the information for testing purposes or apply it only to specific search verticals rather than use it as active ranking signals.

There are also open questions about how much weight these signals carry compared to other ranking factors. The leak doesn’t provide the full context or algorithm details.

Unanswered Questions & Future Implications

As the SEO community continues to analyze the leaked documents, many questions still need to be answered.

Without official confirmation from Google, the authenticity and context of the information are still a matter of debate.

Key open questions include:

  • How much of this documented data is actively used to rank search results?
  • What is the relative weighting and importance of these signals compared to other ranking factors?
  • How have Google’s systems and use of this data evolved?
  • Will Google change its public messaging and be more transparent about using behavioral data?

As the debate surrounding the leak continues, it’s wise to approach the information with a balanced, objective mindset.

Unquestioningly accepting the leak as gospel truth or completely dismissing it are both shortsighted reactions. The reality likely lies somewhere in between.

Potential Implications For SEO Strategies and Website Optimization

It would be highly inadvisable to act on information shared from this supposed ‘leak’ without confirming whether it’s an actual Google search document.

Further, even if the content originates from search, the information is a year old and could have changed. Any insights derived from the leaked documentation should not be considered actionable now.

With that in mind, while the full implications remain unknown, here’s what we can glean from the leaked information.

1. Emphasis On User Engagement Metrics

If click data and user engagement metrics are direct ranking factors, as the leaked documents suggest, it could place greater emphasis on optimizing for these metrics.

This means crafting compelling titles and meta descriptions to increase click-through rates, ensuring fast page loads and intuitive navigation to reduce bounces, and strategically linking to keep users engaged on your site.

Driving traffic through other channels like social media and email can also help generate positive engagement signals.

However, it’s important to note that optimizing for user engagement shouldn’t come at the expense of creating reader-focused content. Gaming engagement metrics are unlikely to be a sustainable, long-term strategy.

Google has consistently emphasized the importance of quality and relevance in its public statements, and based on the leaked information, this will likely remain a key focus. Engagement optimization should support and enhance quality content, not replace it.

2. Potential Changes To Link-Building Strategies

The leaked documents contain information about how Google treats different types of links and their impact on search rankings.

This includes details about the use of anchor text, the classification of links into different quality tiers based on traffic to the linking page, and the potential for links to be ignored or demoted based on various spam factors.

If this information is accurate, it could influence how SEO professionals approach link building and the types of links they prioritize.

Links that drive real click-throughs may carry more weight than links on rarely visited pages.

The fundamentals of good link building still apply—create link-worthy content, build genuine relationships, and seek natural, editorially placed links that drive qualified referral traffic.

The leaked information doesn’t change this core approach but offers some additional nuance to be aware of.

3. Increased Focus On Brand Building and Driving Search Demand

The leaked documents suggest that Google uses brand-related signals and offline popularity as ranking factors. This could include metrics like brand mentions, searches for the brand name, and overall brand authority.

As a result, SEO strategies may emphasize building brand awareness and authority through both online and offline channels.

Tactics could include:

  • Securing brand mentions and links from authoritative media sources.
  • Investing in traditional PR, advertising, and sponsorships to increase brand awareness.
  • Encouraging branded searches through other marketing channels.
  • Optimizing for higher search volumes for your brand vs. unbranded keywords.
  • Building engaged social media communities around your brand.
  • Establishing thought leadership through original research, data, and industry contributions.

The idea is to make your brand synonymous with your niche and build an audience that seeks you out directly. The more people search for and engage with your brand, the stronger those brand signals may become in Google’s systems.

4. Adaptation To Vertical-Specific Ranking Factors

Some leaked information suggests that Google may use different ranking factors or algorithms for specific search verticals, such as news, local search, travel, or e-commerce.

If this is the case, SEO strategies may need to adapt to each vertical’s unique ranking signals and user intents.

For example, local search optimization may focus more heavily on factors like Google My Business listings, local reviews, and location-specific content.

Travel SEO could emphasize collecting reviews, optimizing images, and directly providing booking/pricing information on your site.

News SEO requires focusing on timely, newsworthy content and optimized article structure.

While the core principles of search optimization still apply, understanding your particular vertical’s nuances, based on the leaked information and real-world testing, can give you a competitive advantage.

The leaks suggest a vertical-specific approach to SEO could give you an advantage.


The Google API documentation leak has created a vigorous discussion about Google’s ranking systems.

As the SEO community continues to analyze and debate the leaked information, it’s important to remember a few key things:

  1. The information isn’t fully verified and lacks context. Drawing definitive conclusions at this stage is premature.
  2. Google’s ranking algorithms are complex and constantly evolving. Even if entirely accurate, this leak only represents a snapshot in time.
  3. The fundamentals of good SEO – creating high-quality, relevant, user-centric content and promoting it effectively – still apply regardless of the specific ranking factors at play.
  4. Real-world testing and results should always precede theorizing based on incomplete information.

What To Do Next

As an SEO professional, the best course of action is to stay informed about the leak.

Because details about the document remain unknown, it’s not a good idea to consider any takeaways actionable.

Most importantly, remember that chasing algorithms is a losing battle.

The only winning strategy in SEO is to make your website the best result for your message and audience. That’s Google’s endgame, and that’s where your focus should be, regardless of what any particular leaked document suggests.

Source link

Keep an eye on what we are doing
Be the first to get latest updates and exclusive content straight to your email inbox.
We promise not to spam you. You can unsubscribe at any time.
Invalid email address
Continue Reading


Google’s AI Overviews Shake Up Ecommerce Search Visibility




Google's AI Overviews Shake Up Ecommerce Search Visibility

An analysis of 25,000 ecommerce queries by Bartosz Góralewicz, founder of Onely, reveals the impact of Google’s AI overviews on search visibility for online retailers.

The study found that 16% of eCommerce queries now return an AI overview in search results, accounting for 13% of total search volume in this sector.

Notably, 80% of the sources listed in these AI overviews do not rank organically for the original query.

“Ranking #1-3 gives you only an 8% chance of being a source in AI overviews,” Góralewicz stated.

Shift Toward “Accelerated” Product Experiences

International SEO consultant Aleyda Solis analyzed the disconnect between traditional organic ranking and inclusion in AI overviews.

According to Solis, for product-related queries, Google is prioritizing an “accelerated” approach over summarizing currently ranking pages.

She commented Góralewicz’ findings, stating:

“… rather than providing high level summaries of what’s already ranked organically below, what Google does with e-commerce is “accelerate” the experience by already showcasing what the user would get next.”

Solis explains that for queries where Google previously ranked category pages, reviews, and buying guides, it’s now bypassing this level of results with AI overviews.

Assessing AI Overview Traffic Impact

To help retailers evaluate their exposure, Solis has shared a spreadsheet that analyzes the potential traffic impact of AI overviews.

As Góralewicz notes, this could be an initial rollout, speculating that “Google will expand AI overviews for high-cost queries when enabling ads” based on data showing they are currently excluded for high cost-per-click keywords.

An in-depth report across ecommerce and publishing is expected soon from Góralewicz and Onely, with additional insights into this search trend.

Why SEJ Cares

AI overviews represent a shift in how search visibility is achieved for ecommerce websites.

With most overviews currently pulling product data from non-ranking sources, the traditional connection between organic rankings and search traffic is being disrupted.

Retailers may need to adapt their SEO strategies for this new search environment.

How This Can Benefit You

While unsettling for established brands, AI overviews create new opportunities for retailers to gain visibility without competing for the most commercially valuable keywords.

Ecommerce sites can potentially circumvent traditional ranking barriers by optimizing product data and detail pages for Google’s “accelerated” product displays.

The detailed assessment framework provided by Solis enables merchants to audit their exposure and prioritize optimization needs accordingly.


What are the key findings from the analysis of AI overviews & ecommerce queries?

Góralewicz’s analysis of 25,000 ecommerce queries found:

  • 16% of ecommerce queries now return an AI overview in the search results.
  • 80% of the sources listed in these AI overviews do not rank organically for the original query.
  • Ranking positions #1-3 only provides an 8% chance of being a source in AI overviews.

These insights reveal significant shifts in how ecommerce sites need to approach search visibility.

Why are AI overviews pulling product data from non-ranking sources, and what does this mean for retailers?

Google’s AI overviews prioritize “accelerated” experiences over summarizing currently ranked pages for product-related queries.

This shift focuses on showcasing directly what users seek instead of traditional organic results.

For retailers, this means:

  • A need to optimize product pages beyond traditional SEO practices, catering to the data requirements of AI overviews.
  • Opportunities to gain visibility without necessarily holding top organic rankings.
  • Potential to bypass traditional ranking barriers by focusing on enhanced product data integration.

Retailers must adapt quickly to remain competitive in this evolving search environment.

What practical steps can retailers take to evaluate and improve their search visibility in light of AI overview disruptions?

Retailers can take several practical steps to evaluate and improve their search visibility:

  • Utilize the spreadsheet provided by Aleyda Solis to assess the potential traffic impact of AI overviews.
  • Optimize product and detail pages to align with the data and presentation style preferred by AI overviews.
  • Continuously monitor changes and updates to AI overviews, adapting strategies based on new data and trends.

These steps can help retailers navigate the impact of AI overviews and maintain or improve their search visibility.

Featured Image: Marco Lazzarini/Shutterstock

Source link

Keep an eye on what we are doing
Be the first to get latest updates and exclusive content straight to your email inbox.
We promise not to spam you. You can unsubscribe at any time.
Invalid email address
Continue Reading