
How To Use IndexNow API With Python For Bulk Indexing



IndexNow is a protocol developed by Microsoft Bing and adopted by Yandex that enables webmasters and SEO pros to easily notify search engines when a webpage has been updated via an API.

And today, Microsoft announced that it is making the protocol easier to implement by ensuring that submitted URLs are shared between search engines.

Given its positive implications and the promise of a faster indexing experience for publishers, the IndexNow API should be on every SEO professional’s radar.

Using Python to automate URL submission to the IndexNow API, whether one URL at a time or via bulk API requests, can make managing IndexNow more efficient for you.

In this tutorial, you’ll learn how to do just that, with step-by-step instructions for using the IndexNow API to submit URLs to Microsoft Bing in bulk with Python.

Note: The IndexNow API is similar to Google’s Indexing API, with one difference: the Google Indexing API is only for job postings or broadcast events embedded in a video object.

Google announced that it will test the IndexNow API but hasn’t provided an update since.

Bulk Indexing Using IndexNow API with Python: Getting Started

Below are the prerequisites for understanding and implementing this IndexNow API tutorial.

These are the Python packages and libraries that will be used:

  • Advertools (required).
  • Pandas (required).
  • Requests (required).
  • Time (optional).
  • JSON (optional).

Before getting started, reading the basics will help you understand this IndexNow API and Python tutorial better. We will be using an API key and a .txt file for authentication, along with specific HTTP headers.

IndexNow API Usage Steps With Python

1. Import The Python Libraries

To use the necessary Python libraries, we will use the “import” command.

  • Advertools will be used for sitemap URL extraction.
  • Requests will be used for making the GET and POST requests.
  • Pandas will be used for taking the URLs in the sitemap into a list object.
  • The “time” module will be used to prevent a “429 Too Many Requests” error with the “sleep()” method.
  • JSON will be used for modifying the POST request’s JSON object if needed.

Below, you will find all of the necessary import lines for the IndexNow API tutorial.

import advertools as adv
import pandas as pd
import requests
import json
import time

2. Extracting The Sitemap URLs With Python

To extract the URLs from a sitemap file, different web scraping methods and libraries, such as Requests or Scrapy, can be used.

But to keep things simple and efficient, I will use my favorite Python SEO package – Advertools.

With only a single line of code, all of the URLs within a sitemap can be extracted.

sitemap_urls = adv.sitemap_to_df("https://www.example.com/sitemap_index.xml")

The “sitemap_to_df” method of Advertools extracts all the URLs along with other sitemap-related tags such as “lastmod” or “priority.”

Below, you can see the output of the “adv.sitemap_to_df” command.

Sitemap URL extraction can be done via Advertools’ “sitemap_to_df” method.

All of the URLs and dates are specified within the “sitemap_urls” variable.
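If you only want to notify search engines about recently changed pages, you can filter the data frame by its “lastmod” column before extracting the URLs. Below is a minimal sketch, assuming your sitemap provides “lastmod” values (Advertools parses them as timezone-aware timestamps):

# Keep only the URLs modified within the last seven days.
recent_urls = sitemap_urls[
    sitemap_urls["lastmod"] >= pd.Timestamp.now(tz="UTC") - pd.Timedelta(days=7)
]
print(recent_urls[["loc", "lastmod"]])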

Since sitemaps are useful sources for search engines and SEOs, Advertools’ sitemap_to_df method can be used for many different tasks including a Sitemap Python Audit.

But that’s a topic for another time.

3. Take The URLs Into A List Object With “to_list()”

Python’s Pandas library has a method, “to_list()”, for turning a data frame column (a data series) into a list object.

Below is an example usage:

sitemap_urls["loc"].to_list()

Below, you can see the result:

Pandas’ “to_list” method can be used with Advertools for listing the URLs.

All URLs within the sitemap are in a Python list object.


4. Understand The URL Syntax Of Microsoft Bing’s IndexNow API

Let’s take a look at the URL syntax of the IndexNow API.

Here’s an example:

https://<searchengine>/indexnow?url=url-changed&key=your-key

The URL syntax follows the RFC 3986 standard, with each variable passed as a query parameter.

  • The <searchengine> represents the name of the search engine that you will use the IndexNow API for.
  • The “?url=” parameter determines the URL that will be submitted to the search engine via the IndexNow API.
  • The “&key=” parameter is the API key that will be used with the IndexNow API.
  • The “&keyLocation=” parameter provides proof of authenticity, showing that you own the website that the IndexNow API will be used for.

The “&keyLocation” will bring us to the API Key and its “.txt” version.
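Because submitted URLs can themselves contain reserved characters, it is safer to percent-encode them when building the endpoint manually. Below is a minimal sketch using Python’s standard library (the key and key location values are placeholders):

from urllib.parse import quote

key = "your-indexnow-key"  # placeholder value
location = "https://www.example.com/your-indexnow-key.txt"  # placeholder value
page = "https://www.example.com/some page?ref=news"

# Percent-encode the URL values so reserved characters survive intact.
endpoint = (
    f"https://www.bing.com/indexnow?url={quote(page, safe='')}"
    f"&key={key}&keyLocation={quote(location, safe='')}"
)
print(endpoint)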

5. Gather The API Key For IndexNow And Upload It To The Root

You’ll need a valid key to use the IndexNow API.

Use this link to generate the Microsoft Bing IndexNow API Key.

There is no limit on generating IndexNow API keys.

Clicking the “Generate” button creates an IndexNow API Key.

When you click on the download button, it will download the “.txt” version of the IndexNow API Key.

The IndexNow API key can be generated at the address stated by Microsoft Bing.
The downloaded IndexNow API key as a .txt file.

The API key value will be both the file name and the content of the text file.

The IndexNow API key in the TXT file should match both the file’s name and the actual API key value.
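You do not have to use Bing’s generator: the IndexNow documentation allows you to create your own key (8 to 128 characters drawn from a-z, A-Z, 0-9, and dashes) and write the matching text file yourself. A minimal sketch:

import secrets

# Generate a 32-character hexadecimal key within IndexNow's allowed character set.
my_key = secrets.token_hex(16)

# The file must be named after the key and contain only the key itself.
with open(f"{my_key}.txt", "w") as f:
    f.write(my_key)

print(my_key)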

The next step is uploading this TXT file to the root of the website’s server.

Since I use FileZilla for my FTP, I have uploaded it easily to my web server’s root.

By putting the .txt file into the web server’s root folder, the IndexNow API setup can be completed.
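Before submitting anything, it is worth verifying that the key file is publicly reachable and contains the expected value:

location = "https://www.example.com/22bc7c564b334f38b0b1ed90eec8f2c5.txt"  # your key file URL

# The key file should return a 200 status code and contain exactly the key.
check = requests.get(location)
print(check.status_code, check.text.strip())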

The next step is performing a simple for loop example for submitting all of the URLs within the sitemap.

6. Submit The URLs Within The Sitemap With Python To IndexNow API

To submit a single URL to IndexNow, you can use a single “requests.get()” instance. But to make it more useful, we will use a for loop.

To submit URLs in bulk to the IndexNow API with Python, follow the steps below:

  1. Create a key variable with the IndexNow API key value.
  2. Replace the <searchengine> section with the search engine that you want to submit URLs to (Microsoft Bing or Yandex, for now).
  3. Assign all of the URLs from the sitemap within a list to a variable.
  4. Assign the URL of the “txt” file within the root of the web server to a “location” variable.
  5. Place the URL, key, and key location within an f-string.
  6. Start your for loop, and use “requests.get()” for every URL within the sitemap.

Below, you can see the implementation:

key = "22bc7c564b334f38b0b1ed90eec8f2c5"
url = sitemap_urls["loc"].to_list()
for i in url:
          endpoint = f"https://bing.com/indexnow?url={i}&key={key}&keyLocation={location}"
          response = requests.get(endpoint)
          print(i)
          print(endpoint)
          print(response.status_code, response.content)
          #time.sleep(5)

If you’re concerned about sending too many requests to the IndexNow API, you can use the Python time module to make the script wait between every request.

Here you can see the output of the script:

The empty string as the request’s response body represents the success of the IndexNow API request, according to Microsoft Bing’s IndexNow documentation.

The 200 Status Code means that the request was successful.
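A 429 status code, on the other hand, means you have sent too many requests. Below is a minimal sketch of a variant of the loop above that backs off and retries once when that happens:

for i in url:
    endpoint = f"https://bing.com/indexnow?url={i}&key={key}&keyLocation={location}"
    response = requests.get(endpoint)
    if response.status_code == 429:
        # Too many requests: wait 30 seconds, then retry the same URL once.
        time.sleep(30)
        response = requests.get(endpoint)
    print(i, response.status_code)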


With the for loop, I have submitted 194 URLs to Microsoft Bing.

According to the IndexNow Documentation, the HTTP 200 Response Code signals that the search engine is aware of the change in the content or the new content. But it doesn’t necessarily guarantee indexing.

For instance, I have used the same script for another website. After 120 seconds, Microsoft Bing said that 31 results were found, and it showed four pages of them.

The only problem is that the first page contains only two results, and it says that the URLs are blocked by robots.txt, even though the blocking was removed before submission.

This can happen if the robots.txt file was changed to unblock some URLs shortly before using the IndexNow API, because it seems that Bing does not re-check the robots.txt.

Thus, if you previously blocked the URLs, Bing tries to index your website but still uses the previous version of the robots.txt file.

What happens if you use the IndexNow API while blocking Bingbot via robots.txt.

On the second page, there is only one result:

Microsoft Bing might use a different indexation and pagination method than Google. The second page shows only one of the 31 results.

On the third page, there are no results, and it shows Microsoft Bing Translate for translating the string within the search bar.

Sometimes, Microsoft Bing infers the “site” search operator as part of the query.

When I checked Google Analytics, it showed that Bing still hadn’t crawled or indexed the website. I know this is true because I also checked the log files.

Below, you will see the Bing Webmaster Tools report for the example website:

Bing Webmaster Tools Report

It says that I submitted 38 URLs.

The next step will involve the bulk request with the POST Method and a JSON object.

7. Perform An HTTP Post Request To The IndexNow API

To perform an HTTP post request to the IndexNow API for a set of URLs, a JSON object should be used with specific properties.

  • The “host” property represents the host of the website whose URLs are being submitted (per the IndexNow documentation).
  • “key” represents the API key.
  • “keyLocation” represents the location of the API key’s txt file within the web server.
  • “urlList” represents the URL set that will be submitted to the IndexNow API.
  • The headers represent the POST request headers that will be used, which are “Content-Type” and “charset.”

Since this is a POST request, the “requests.post” will be used instead of the “requests.get().”

Below, you will find an example of a set of URLs submitted to Microsoft Bing’s IndexNow API.

data = {
  # Per the IndexNow documentation, "host" is the host of your own website.
  "host": "www.example.com",
  "key": "22bc7c564b334f38b0b1ed90eec8f2c5",
  "keyLocation": "https://www.example.com/22bc7c564b334f38b0b1ed90eec8f2c5.txt",
  "urlList": [
    'https://www.example.com/technical-seo/http-header/',
    'https://www.example.com/python-seo/nltk/lemmatize',
    'https://www.example.com/pagespeed/broser-hints/preload',
    'https://www.example.com/python-seo/nltk/stemming',
    'https://www.example.com/python-seo/categorize-queries/',
    'https://www.example.com/python-seo/nltk/tokenization',
    'https://www.example.com/review/oncrawl/',
    'https://www.example.com/technical-seo/hreflang/',
    'https://www.example.com/technical-seo/multilingual-seo/'
  ]
}
headers = {"Content-Type": "application/json; charset=utf-8"}
# Serialize the payload as JSON and POST it to Bing's /indexnow endpoint.
r = requests.post("https://www.bing.com/indexnow", data=json.dumps(data), headers=headers)
r.status_code, r.content

In the example above, we have performed a POST Request to index a set of URLs.

We have used the “data” object for the “data” parameter of “requests.post”, and the headers object for the “headers” parameter.

Since we POST a JSON object, the request should have a “Content-Type: application/json; charset=utf-8” header, and the payload should be serialized with “json.dumps”.
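Alternatively, the Requests library can serialize the payload for you: passing the dictionary through the “json” parameter sets the JSON body and the “Content-Type: application/json” header automatically.

# Equivalent request; Requests serializes "data" and sets the JSON header itself.
r = requests.post("https://www.bing.com/indexnow", json=data)
print(r.status_code, r.content)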

After I made the POST request, my live log file analysis dashboard started to show immediate hits from Bingbot 135 seconds later.

Bingbot Log File Analysis

8. Create A Custom Function For The IndexNow API To Save Time

Creating a custom function for the IndexNow API is useful for decreasing the time spent on code preparation.


Thus, I have created two different custom Python functions to use the IndexNow API for bulk requests and individual requests.

Below, you will find an example for only the bulk requests to the IndexNow API.

The custom function for bulk requests is called “submit_url_set.”

Simply fill in the parameters, and you will be able to use the function properly.

def submit_url_set(set_: list, key, location, host="https://www.bing.com",
                   headers={"Content-Type": "application/json; charset=utf-8"}):
    data = {
        "host": "www.example.com",  # replace with your own site's host
        "key": key,
        "keyLocation": location,
        "urlList": set_
    }
    # Serialize the payload and POST it to the search engine's /indexnow endpoint.
    r = requests.post(f"{host}/indexnow", data=json.dumps(data), headers=headers)
    return r.status_code

An explanation of this custom function:

  • The “set_” parameter provides a list of URLs.
  • The “key” parameter provides an IndexNow API key.
  • The “location” parameter provides the location of the IndexNow API key’s txt file within the web server.
  • The “host” parameter provides the search engine host address.
  • The “headers” parameter provides the headers that are necessary for the IndexNow API.

I have defined some of the parameters with default values such as “host” for Microsoft Bing. If you want to use it for Yandex, you will need to state it while calling the function.

Below is an example usage:

submit_url_set(set_=sitemap_urls["loc"].to_list(), key="22bc7c564b334f38b0b1ed90eec8f2c5", location="https://www.example.com/22bc7c564b334f38b0b1ed90eec8f2c5.txt")
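To target Yandex instead, override the default host (this assumes the function sketch above and Yandex’s documented IndexNow endpoint at yandex.com):

submit_url_set(set_=sitemap_urls["loc"].to_list(),
               key="22bc7c564b334f38b0b1ed90eec8f2c5",
               location="https://www.example.com/22bc7c564b334f38b0b1ed90eec8f2c5.txt",
               host="https://yandex.com")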

If you want to extract sitemap URLs with a different method, or if you want to use the IndexNow API for a different URL set, you will need to change the “set_” parameter value.

Below, you will see an example of the Custom Python function for the IndexNow API for only individual requests.

def submit_url(urls: list, location, key="22bc7c564b334f38b0b1ed90eec8f2c5"):
    # Submit each URL individually with its own GET request.
    for i in urls:
        endpoint = f"https://bing.com/indexnow?url={i}&key={key}&keyLocation={location}"
        response = requests.get(endpoint)
        print(i)
        print(endpoint)
        print(response.status_code, response.content)
        #time.sleep(5)

Since this uses a for loop, you can submit URLs one by one. The search engine can prioritize these types of requests differently.

Since some bulk requests will include unimportant URLs, the individual requests might be seen as more reasonable.

If you want to include the sitemap URL extraction within the function, you can build Advertools into the function itself, as shown below.
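A minimal sketch, assuming the “submit_url_set” function above (the name “submit_sitemap” is my own):

def submit_sitemap(sitemap_url, key, location, host="https://www.bing.com"):
    # Extract the sitemap URLs with Advertools, then submit them in one bulk POST.
    urls = adv.sitemap_to_df(sitemap_url)["loc"].to_list()
    return submit_url_set(urls, key, location, host=host)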

Tips For Using The IndexNow API With Python

An Overview of How The IndexNow API Works, Capabilities & Uses

  • The IndexNow API doesn’t guarantee that your website or the URLs that you submitted will be indexed.
  • You should only submit URLs that are new or for which the content has changed.
  • The IndexNow API impacts the crawl budget.
  • Microsoft Bing has a threshold for the URL Content Quality and Calculation of the Crawl Need for a URL. If the submitted URL is not good enough, they may not crawl it.
  • You can submit up to 10,000 URLs per POST request.
  • The IndexNow API suggests submitting URLs even if the website is small.
  • Submitting the same pages many times within a day can cause the IndexNow API to stop crawling the redundant URLs or the source.
  • The IndexNow API is useful for sites where the content changes frequently, like every 10 minutes.
  • IndexNow API is useful for pages that are gone and are returning a 404 response code. It lets the search engine know that the URLs are gone.
  • IndexNow API can be used for notifying of new 301 or 302 redirects.
  • The 200 Status Response Code means that the search engine is aware of the submitted URL.
  • The 429 Status Code means that you made too many requests to the IndexNow API.
  • If you put a “txt” file that contains the IndexNow API Key into a subfolder, the IndexNow API can be used only for that subfolder.
  • If you have two different CMSes, you can use two different IndexNow API keys for two different site sections.
  • Subdomains need to use a different IndexNow API key.
  • Even if you already use a sitemap, using IndexNow API is useful because it efficiently tells the search engines of website changes and reduces unnecessary bot crawling.
  • All search engines that adopt the IndexNow API (Microsoft Bing and Yandex) share the URLs that are submitted between each other.
IndexNow API documentation and usage tips can be found above.

In this IndexNow API tutorial and guideline with Python, we have examined a new search engine technology.

Instead of waiting to be crawled, publishers can notify the search engines to crawl when there is a need.

IndexNow reduces the use of search engine data center resources, and now you know how to use Python to make the process more efficient, too.

More resources:

An Introduction To Python & Machine Learning For Technical SEO

How to Use Python to Monitor & Measure Website Performance

Advanced Technical SEO: A Complete Guide


Featured Image: metamorworks/Shutterstock







Are Contextual Links A Google Ranking Factor?



Inbound links are a ranking signal that can vary greatly in terms of how they’re weighted by Google.

One of the key attributes that experts say can separate a high value link from a low value link is the context in which it appears.

When a link is placed within relevant content, it’s thought to have a greater impact on rankings than a link randomly inserted within unrelated text.

Is there any truth to that claim?

Let’s dive deeper into what has been said about contextual links as a ranking factor to see whether there’s any evidence to support those claims.

The Claim: Contextual Links Are A Ranking Factor

A “contextual link” refers to an inbound link pointing to a URL that’s relevant to the content in which the link appears.

When an article links to a source to provide additional context for the reader, for example, that’s a contextual link.

Contextual links add value rather than being a distraction.

They should flow naturally with the content, giving the reader some clues about the page they’re being directed to.

Not to be confused with anchor text, which refers to the clickable part of a link, a contextual link is defined by the surrounding text.

A link’s anchor text could be related to the webpage it’s pointing to, but if it’s surrounded by content that’s otherwise irrelevant then it doesn’t qualify as a contextual link.

Contextual links are said to be a Google ranking factor, with claims that they’re weighted higher by the search engine than other types of links.

One of the reasons why Google might care about context when it comes to links is because of the experience it creates for users.


When a user clicks a link and lands on a page related to what they were previously looking at, it’s a better experience than getting directed to a webpage they aren’t interested in.

Modern guides to link building all recommend getting links from relevant URLs, as opposed to going out and placing links anywhere that will take them.

There’s now a greater emphasis on quality over quantity when it comes to link building, and a link is considered higher quality when its placement makes sense in context.

One high quality contextual link can, in theory, be worth more than multiple lower quality links.

That’s why experts advise site owners to gain at least a few contextual links, as that will get them further than building dozens of random links.

If Google weights the quality of links higher or lower based on context, it would mean Google’s crawlers can understand webpages and assess how closely they relate to other URLs on the web.

Is there any evidence to support this?

The Evidence For Contextual Links As A Ranking Factor

Evidence in support of contextual links as a ranking factor can be traced back to 2012 with the launch of the Penguin algorithm update.

Google’s original algorithm, PageRank, was built entirely on links. The more links pointing to a website, the more authority it was considered to have.

Site owners could catapult their websites to the top of Google’s search results by building as many links as possible. It didn’t matter whether the links were contextual or arbitrary.

Google’s PageRank algorithm wasn’t as selective about which links it valued (or devalued) over others until it was augmented with the Penguin update.


Penguin brought a number of changes to Google’s algorithm that made it more difficult to manipulate search rankings through spammy link building practices.

In Google’s announcement of the launch of Penguin, former search engineer Matt Cutts highlighted a specific example of the link spam it’s designed to target.

This example depicts the exact opposite of a contextual link, with Cutts saying:

“Here’s an example of a site with unusual linking patterns that is also affected by this change. Notice that if you try to read the text aloud you’ll discover that the outgoing links are completely unrelated to the actual content, and in fact, the page text has been “spun” beyond recognition.”

A contextual link, on the other hand, looks like the one a few paragraphs above linking to Google’s blog post.

Links with context share the following characteristics:

  • Placement fits in naturally with the content.
  • Linked URL is relevant to the article.
  • Reader knows where they’re going when they click on it.

All of the documentation Google has published about Penguin over the years is the strongest evidence available in support of contextual links as a ranking factor.

See: A Complete Guide to the Google Penguin Algorithm Update

Google will never outright say “contextual link building is a ranking factor,” however, because the company discourages any deliberate link building at all.

As Cutts adds at the end of his Penguin announcement, Google would prefer to see webpages acquire links organically:

“We want people doing white hat search engine optimization (or even no search engine optimization at all) to be free to focus on creating amazing, compelling web sites.”

Contextual Links Are A Ranking Factor: Our Verdict


Contextual links are probably a Google ranking factor.

A link is weighted higher when it’s used in context than if it’s randomly placed within unrelated content.

But that doesn’t necessarily mean links without context will negatively impact a site’s rankings.

External links are largely outside a site owner’s control.

If a website links to you out of context it’s not a cause for concern, because Google is capable of ignoring low value links.

On the other hand, if Google detects a pattern of unnatural links, then that could count against a site’s rankings.

If you have actively engaged in non-contextual link building in the past, it may be wise to consider using the disavow tool.


Featured Image: Paulo Bobita/Search Engine Journal






Latent Semantic Indexing (LSI): Is It A Google Ranking Factor?



Latent semantic indexing (LSI) is an indexing and information retrieval method used to identify patterns in the relationships between terms and concepts.

With LSI, a mathematical technique is used to find semantically related terms within a collection of text (an index) where those relationships might otherwise be hidden (or latent).

And in that context, this sounds like it could be super important for SEO.

Right?

After all, Google is a massive index of information, and we’re hearing all kinds of things about semantic search and the importance of relevance in the search ranking algorithm.

If you’ve heard rumblings about latent semantic indexing in SEO or been advised to use LSI keywords, you aren’t alone.

But will LSI actually help improve your search rankings? Let’s take a look.

The Claim: Latent Semantic Indexing As A Ranking Factor

The claim is simple: Optimizing web content using LSI keywords helps Google better understand it and you’ll be rewarded with higher rankings.

Backlinko defines LSI keywords in this way:

“LSI (Latent Semantic Indexing) Keywords are conceptually related terms that search engines use to deeply understand content on a webpage.”

By using contextually related terms, you can deepen Google’s understanding of your content. Or so the story goes.

That resource goes on to make some pretty compelling arguments for LSI keywords:

  • “Google relies on LSI keywords to understand content at such a deep level.”
  • “LSI Keywords are NOT synonyms. Instead, they’re terms that are closely tied to your target keyword.”
  • “Google doesn’t ONLY bold terms that exactly match what you just searched for (in search results). They also bold words and phrases that are similar. Needless to say, these are LSI keywords that you want to sprinkle into your content.”

Does this practice of “sprinkling” terms closely related to your target keyword help improve your rankings via LSI?

The Evidence For LSI As A Ranking Factor

Relevance is identified as one of five key factors that help Google determine which result is the best answer for any given query.


As Google explains in its How Search Works resource:

“To return relevant results for your query, we first need to establish what information you’re looking for – the intent behind your query.”

Once intent has been established:

“…algorithms analyze the content of webpages to assess whether the page contains information that might be relevant to what you are looking for.”

Google goes on to explain that the “most basic signal” of relevance is that the keywords used in the search query appear on the page. That makes sense – if you aren’t using the keywords the searcher is looking for, how could Google tell you’re the best answer?

Now, this is where some believe LSI comes into play.

If using keywords is a signal of relevance, using just the right keywords must be a stronger signal.

There are purpose-built tools dedicated to helping you find these LSI keywords, and believers in this tactic recommend using all kinds of other keyword research tactics to identify them as well.

The Evidence Against LSI As A Ranking Factor

Google’s John Mueller has been crystal clear on this one:

“…we have no concept of LSI keywords. So that’s something you can completely ignore.”

There’s a healthy skepticism in SEO that Google may say things to lead us astray in order to protect the integrity of the algorithm. So let’s dig in here.

First, it’s important to understand what LSI is and where it came from.

Latent semantic structure emerged as a methodology for retrieving textual objects from files stored in a computer system in the late 1980s. As such, it’s an example of one of the earlier information retrieval (IR) concepts available to programmers.

As computer storage capacity improved and electronically available sets of data grew in size, it became more difficult to locate exactly what one was looking for in that collection.

Researchers described the problem they were trying to solve in a patent application filed September 15, 1988:

“Most systems still require a user or provider of information to specify explicit relationships and links between data objects or text objects, thereby making the systems tedious to use or to apply to large, heterogeneous computer information files whose content may be unfamiliar to the user.”


Keyword matching was being used in IR at the time, but its limitations were evident long before Google came along.

Too often, the words a person used to search for the information they sought were not exact matches for the words used in the indexed information.

There are two reasons for this:

  • Synonymy: the diverse range of words used to describe a single object or idea results in relevant results being missed.
  • Polysemy: the different meanings of a single word results in irrelevant results being retrieved.

These are still issues today, and you can imagine what a massive headache it is for Google.

However, the methodologies and technology Google uses to solve for relevance long ago moved on from LSI.

What LSI did was automatically create a “semantic space” for information retrieval.

As the patent explains, LSI treated this unreliability of association data as a statistical problem.

Without getting too into the weeds, these researchers essentially believed that there was a hidden underlying latent semantic structure they could tease out of word usage data.

Doing so would reveal the latent meaning and enable the system to bring back more relevant results – and only the most relevant results – even if there’s no exact keyword match.

Here’s what that LSI process actually looks like:

Image created by author, January 2022

And here’s the most important thing you should note about the above illustration of this methodology from the patent application: there are two separate processes happening.

First, the collection or index undergoes Latent Semantic Analysis.

Second, the query is analyzed and the already-processed index is then searched for similarities.
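To make those two steps concrete, here is a toy sketch of LSA-style retrieval using scikit-learn. This is purely illustrative, not Google’s implementation: the collection is projected into a latent space first, and the query is then mapped into that same space and compared.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "seo ranking factors and search algorithms",
    "latent semantic indexing for information retrieval",
    "python tutorial for submitting urls in bulk",
]

# Step 1: analyze the collection and build the latent semantic space.
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs)
svd = TruncatedSVD(n_components=2)
doc_vectors = svd.fit_transform(tfidf)

# Step 2: project the query into the same space and rank documents by similarity.
query_vector = svd.transform(vectorizer.transform(["semantic search"]))
print(cosine_similarity(query_vector, doc_vectors))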

And that’s where the fundamental problem with LSI as a Google search ranking signal lies.

Google’s index is massive at hundreds of billions of pages, and it’s growing constantly.


Each time a user inputs a query, Google is sorting through its index in a fraction of a second to find the best answer.

Using the above methodology in the algorithm would require that Google:

  1. Recreate that semantic space using LSA across its entire index.
  2. Analyze the semantic meaning of the query.
  3. Find all similarities between the semantic meaning of the query and documents in the semantic space created from analyzing the entire index.
  4. Sort and rank those results.

That’s a gross oversimplification, but the point is that this isn’t a scalable process.

This would be super useful for small collections of information. It was helpful for surfacing relevant reports inside a company’s computerized archive of technical documentation, for example.

The patent application illustrates how LSI works using a collection of nine documents. That’s what it was designed to do. LSI is primitive in terms of computerized information retrieval.

Latent Semantic Indexing As A Ranking Factor: Our Verdict


While the underlying principles of eliminating noise by determining semantic relevance have surely informed developments in search ranking since LSA/LSI was patented, LSI itself has no useful application in SEO today.

It hasn’t been ruled out completely, but there is no evidence that Google has ever used LSI to rank results. And Google definitely isn’t using LSI or LSI keywords today to rank search results.

Those who recommend using LSI keywords are latching on to a concept they don’t quite understand in an effort to explain why the ways in which words are related (or not) is important in SEO.

Relevance and intent are foundational considerations in Google’s search ranking algorithm.

Those are two of the big questions they’re trying to solve for in surfacing the best answer for any query.

Synonymy and polysemy are still major challenges.

Semantics – that is, our understanding of the various meanings of words and how they’re related – is essential in producing more relevant search results.

But LSI has nothing to do with that.


Featured Image: Paulo Bobita/Search Engine Journal






What Is a Google Broad Core Algorithm Update?



When Google announces a broad core algorithm update, many SEO professionals find themselves asking what exactly changed (besides their rankings).

Google’s acknowledgment of core updates is always vague and doesn’t provide much detail other than to say the update occurred.

The SEO community is typically notified about core updates via the same standard tweets from Google’s Search Liaison.

There’s one announcement from Google when the update begins rolling out, and one on its conclusion, with few additional details in between (if any).

This invariably leaves SEO professionals and site owners asking many questions with respect to how their rankings were impacted by the core update.

To gain insight into what may have caused a site’s rankings to go up, down, or stay the same, it helps to understand what a broad core update is and how it differs from other types of algorithm updates.

After reading this article you’ll have a better idea of what a core update is designed to do, and how to recover from one if your rankings were impacted.

So, What Exactly Is A Core Update?

First, let me get the obligatory “Google makes hundreds of algorithm changes per year, often more than one per day” boilerplate out of the way.

Many of the named updates we hear about (Penguin, Panda, Pigeon, Fred, etc.) are implemented to address specific faults or issues in Google’s algorithms.

In the case of Penguin, it was link spam; in the case of Pigeon, it was local SEO spam.

They all had a specific purpose.

In these cases, Google (sometimes reluctantly) informed us what they were trying to accomplish or prevent with the algorithm update, and we were able to go back and remedy our sites.

A core update is different.

The way I understand it, a core update is a tweak or change to the main search algorithm itself.

You know, the one that has between 200 and 500 ranking factors and signals (depending on which SEO blog you’re reading today).


What a core update means to me is that Google slightly tweaked the importance, order, weights, or values of these signals.

Because of that, they can’t come right out and tell us what changed without revealing the secret sauce.

The simplest way to visualize this would be to imagine 200 factors listed in order of importance.

Now imagine Google changing the order of 42 of those 200 factors.

Rankings would change, but it would be a combination of many things, not due to one specific factor or cause.

Obviously, it isn’t that simple, but that’s a good way to think about a core update.
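As a toy illustration of that mental model (with entirely made-up numbers), notice how re-weighting the same signals reorders two pages:

import numpy as np

# Rows = pages, columns = hypothetical ranking signals (all made up).
signals = np.array([
    [0.9, 0.2, 0.5],  # page A
    [0.4, 0.8, 0.6],  # page B
])

weights_before = np.array([0.5, 0.3, 0.2])
weights_after = np.array([0.2, 0.5, 0.3])  # a "core update" tweaks the weights

print(signals @ weights_before)  # page A scores higher: [0.61 0.56]
print(signals @ weights_after)   # page B scores higher: [0.43 0.66]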

Here’s a purely made up, slightly more complicated example of what Google wouldn’t tell us:

“In this core update, we increased the value of keywords in H1 tags by 2%, increased the value of HTTPS by 18%, decreased the value of keyword in title tag by 9%, changed the D value in our PageRank calculation from .85 to .70, and started using a TF-iDUF retrieval method for logged in users instead of the traditional TF-PDF method.”

(I swear these are real things. I just have no idea if they’re real things used by Google.)

For starters, many SEO pros wouldn’t understand it.

Basically, it means Google may have changed the way they calculate term importance on a page, or the weighing of links in PageRank, or both, or a whole bunch of other factors that they can’t talk about (without giving away the algorithm).

Put simply: Google changed the weight and importance of many ranking factors.

That’s the simple explanation.

At its most complex, Google ran a new training set through their machine learning ranking model, quality raters picked this new set of results as more relevant than the previous set, and the engineers have no idea what weights changed or how they changed because that’s just how machine learning works.


(We all know Google uses quality raters to rate search results. These ratings are how they choose one algorithm change over another – not how they rate your site. Whether they feed this into machine learning is anybody’s guess. But it’s one possibility.)

It’s likely some random combination of weighting delivered more relevant results for the quality raters, so they tested it more, the test results confirmed it, and they pushed it live.

How Can You Recover From A Core Update?

Unlike a major named update that targeted specific things, a core update may tweak the values of everything.

Because websites are weighted against other websites relevant to your query (engineers call this a corpus) the reason your site dropped could be entirely different than the reason somebody else’s increased or decreased in rankings.

To put it simply, Google isn’t telling you how to “recover” because it’s likely a different answer for every website and query.

It all depends on what everybody else trying to rank for your query is doing.

Does every one of them but you have their keyword in the H1 tag? If so then that could be a contributing factor.

Do you all do that already? Then that probably carries less weight for that corpus of results.

It’s very likely that this algorithm update didn’t “penalize” you for something at all. It most likely just rewarded another site more for something else.

Maybe you were killing it with internal anchor text and they were doing a great job of formatting content to match user intent – and Google shifted the weights so that content formatting was slightly higher and internal anchor text was slightly lower.

(Again, hypothetical examples here.)

In reality, it was probably several minor tweaks that, when combined, tipped the scales slightly in favor of one site or another (think of our reordered list here).


Finding that “something else” that is helping your competitors isn’t easy – but it’s what keeps SEO professionals in the business.

Next Steps And Action Items

Rankings are down after a core update – now what?

Your next step is to gather intel on the pages that are ranking where your site used to be.

Conduct a SERP analysis to find positive correlations between pages that are ranking higher for queries where your site is now lower.

Try not to overanalyze the technical details, such as how fast each page loads or what their core web vitals scores are.

Pay attention to the content itself. As you go through it, ask yourself questions like:

  • Does it provide a better answer to the query than your article?
  • Does the content contain more recent data and current stats than yours?
  • Are there pictures and videos that help bring the content to life for the reader?

Google aims to serve content that provides the best and most complete answers to searchers’ queries. Relevance is the one ranking factor that will always win out over all others.

Take an honest look at your content to see if it’s as relevant today as it was prior to the core algorithm update.

From there you’ll have an idea of what needs improvement.

The best advice for conquering core updates?

Keep focusing on:

  • User intent.
  • Quality content.
  • Clean architecture.
  • Google’s guidelines.

Finally, don’t stop improving your site once you reach Position 1, because the site in Position 2 isn’t going to stop.

Yeah, I know, it’s not the answer anybody wants and it sounds like Google propaganda. I swear it’s not.

It’s just the reality of what a core update is.

Nobody said SEO was easy.



Featured Image: Ulvur/Shutterstock






