

Bulk Loading Performance Tests With PageSpeed Insights API & Python




Google offers the PageSpeed Insights API to help SEO pros and developers by combining real-world data with simulated (lab) data, providing load performance timing data for web pages.

The difference between the Google PageSpeed Insights (PSI) and Lighthouse is that PSI involves both real-world and lab data, while Lighthouse performs a page loading simulation by modifying the connection and user-agent of the device.

Another point of difference is that PSI doesn’t supply any information related to web accessibility, SEO, or progressive web apps (PWAs), while Lighthouse provides all of the above.

Thus, when we use PageSpeed Insights API for the bulk URL loading performance test, we won’t have any data for accessibility.

However, PSI provides more information related to the page speed performance, such as “DOM Size,” “Deepest DOM Child Element,” “Total Task Count,” and “DOM Content Loaded” timing.

One more advantage of the PageSpeed Insights API is that it gives the “observed metrics” and “actual metrics” different names.

In this guide, you will learn:

  • How to create a production-level Python Script.
  • How to use APIs with Python.
  • How to construct data frames from API responses.
  • How to analyze the API responses.
  • How to parse URLs and process URL requests’ responses.
  • How to store the API responses with proper structure.

An example output of the Page Speed Insights API call with Python is below.

Screenshot from author, June 2022

Libraries For Using PageSpeed Insights API With Python

The necessary libraries to use PSI API with Python are below.

  • Advertools retrieves testing URLs from the sitemap of a website.
  • Pandas constructs the data frame and flattens the JSON output of the API.
  • Requests makes a request to the specific API endpoint.
  • JSON parses the API response into a dictionary so that specific values can be read.
  • Datetime adds the current date to the output file’s name.
  • Urllib parses the test subject website URL.

How To Use PSI API With Python?

To use the PSI API with Python, follow the steps below.

  • Get a PageSpeed Insights API key.
  • Import the necessary libraries.
  • Parse the URL for the test subject website.
  • Take the Date of Moment for file name.
  • Take URLs into a list from a sitemap.
  • Choose the metrics that you want from PSI API.
  • Create a For Loop for taking the API Response for all URLs.
  • Construct the data frame with chosen PSI API metrics.
  • Output the results in the form of XLSX.

1. Get PageSpeed Insights API Key

Use the PageSpeed Insights API Documentation to get the API Key.

Click the “Get a Key” button below.

Image from, June 2022

Choose a project that you have created in Google Developer Console.

Image from, June 2022

Enable the PageSpeed Insights API on that specific project.

Image from, June 2022

You will need to use the specific API Key in your API Requests.

2. Import The Necessary Libraries

Use the lines below to import the fundamental libraries.

    import advertools as adv
    import pandas as pd
    import requests
    import json
    from datetime import datetime
    from urllib.parse import urlparse

3. Parse The URL For The Test Subject Website

To parse the URL of the subject website, use the code structure below.

    domain = urlparse(sitemap_url)
    domain = domain.netloc.split(".")[1]

The “domain” variable is the parsed version of the sitemap URL.

The “netloc” represents the specific URL’s host section. When we split it with the “.” and take index “1,” we get the “middle section,” which represents the domain name when the host starts with “www.”

Here, “0” is for “www,” “1” for “domain name,” and “2” is for “domain extension,” if we split it with “.”
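
The index-based split only works when the host starts with “www.” The following is a minimal sketch of a more tolerant variant; the helper name `domain_label` is hypothetical, and it still naively assumes a single-part extension such as “.com” (hosts like “example.co.uk” would need a public suffix list).

```python
from urllib.parse import urlparse


def domain_label(sitemap_url: str) -> str:
    """Return the domain name label of a sitemap URL (hypothetical helper)."""
    host = urlparse(sitemap_url).netloc.lower()
    # Drop an optional port, then an optional leading "www.".
    host = host.split(":")[0]
    if host.startswith("www."):
        host = host[len("www."):]
    # Naive assumption: the first remaining label is the domain name.
    return host.split(".")[0]


print(domain_label("https://www.example.com/sitemap.xml"))  # example
print(domain_label("https://example.com/sitemap.xml"))      # example
```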

4. Take The Date Of Moment For File Name

To take the date of the specific function call moment, use the “datetime.now()” method. It provides the date and time of that moment. Format it with “strftime” and the “%Y”, “%m”, and “%d” values. “%Y” is the four-digit year. The “%m” and “%d” are zero-padded numeric values for the month and the day.

    date = datetime.now().strftime("%Y_%m_%d")
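
As a small illustration of the format string, the sketch below uses a fixed date instead of the current moment so that the result is predictable. The file name pattern is an assumption made for this example, not something the script requires.

```python
from datetime import datetime

# A fixed date is used only to make the formatted result predictable.
sample = datetime(2022, 6, 15)
date = sample.strftime("%Y_%m_%d")
print(date)  # 2022_06_15

# Hypothetical file name pattern built from a domain label and the date.
file_name = f"example_{date}_psi.xlsx"
print(file_name)  # example_2022_06_15_psi.xlsx
```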

5. Take URLs Into A List From A Sitemap

To take the URLs into a list form from a sitemap file, use the code block below.

    sitemap = adv.sitemap_to_df(sitemap_url)
    sitemap_urls = sitemap["loc"].to_list()

If you read the Python Sitemap Health Audit, you can learn further information about the sitemaps.

6. Choose The Metrics That You Want From PSI API

To choose the PSI API response JSON properties, you should see the JSON file itself.

It is highly relevant to the reading, parsing, and flattening of JSON objects.

It is even related to Semantic SEO, thanks to the concept of “directed graph,” and “JSON-LD” structured data.

In this article, we won’t focus on examining the specific PSI API Response’s JSON hierarchies.

You can see the metrics that I have chosen to gather from PSI API. It is richer than the basic default output of PSI API, which only gives the Core Web Vitals metrics, along with Speed Index, Interaction to Next Paint, Time to First Byte, and First Contentful Paint.

Of course, it also gives “suggestions” by saying “Avoid Chaining Critical Requests,” but there is no need to put a sentence into a data frame.

In the future, these suggestions, or even every individual chain event, their KB and MS values can be taken into a single column with the name “psi_suggestions.”

For a start, you can check the metrics that I have chosen, and many of them will probably be new to you.

PSI API Metrics, the first section is below.

    fid = []
    lcp = []
    cls_ = []
    url = []
    fcp = []
    performance_score = []
    total_tasks = []
    total_tasks_time = []
    long_tasks = []
    dom_size = []
    maximum_dom_depth = []
    maximum_child_element = []
    observed_fcp  = []
    observed_fid = []
    observed_lcp = []
    observed_cls = []
    observed_fp = []
    observed_fmp = []
    observed_dom_content_loaded = []
    observed_speed_index = []
    observed_total_blocking_time = []
    observed_first_visual_change = []
    observed_last_visual_change = []
    observed_tti = []
    observed_max_potential_fid = []

This section includes all the observed and simulated fundamental page speed metrics, along with some non-fundamental ones, like “DOM Content Loaded,” or “First Meaningful Paint.”

The second section of PSI Metrics focuses on possible byte and time savings from the unused code amount.

    render_blocking_resources_ms_save = []
    unused_javascript_ms_save = []
    unused_javascript_byte_save = []
    unused_css_rules_ms_save = []
    unused_css_rules_bytes_save = []

A third section of the PSI metrics focuses on the possible time savings from a faster server response and from using responsive images.

    possible_server_response_time_saving = []
    possible_responsive_image_ms_save = []

Note: Overall Performance Score comes from “performance_score.”

7. Create A For Loop For Taking The API Response For All URLs

The for loop is to take all of the URLs from the sitemap file and use the PSI API for all of them one by one. The for loop for PSI API automation has several sections.

The first section of the PSI API for loop starts with duplicate URL prevention.

In the sitemaps, you can see a URL that appears multiple times. This section prevents it.

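for i in sitemap_urls[:9]:
         # Prevent the duplicate "/" trailing slash URL requests to override the information.
         # Endpoint of the PageSpeed Insights API v5 (runPagespeed).
         endpoint = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
         if i.endswith("/"):
               r = requests.get(f"{endpoint}?url={i}&strategy=mobile&locale=en&key={api_key}")
         else:
               r = requests.get(f"{endpoint}?url={i}/&strategy=mobile&locale=en&key={api_key}")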

Remember to check the “api_key” at the end of the endpoint for PageSpeed Insights API.
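
The request preparation can also be separated from the loop. The sketch below is one possible way to do it, not the script’s own method: `normalize_url`, `unique_urls`, and `build_psi_url` are hypothetical helpers, the trailing-slash rule assumes the site serves slash-terminated URLs, and “YOUR_API_KEY” is a placeholder. Using `urlencode` escapes the tested URL so that its own “?” or “&” characters do not leak into the API query string.

```python
from urllib.parse import urlencode

# Public v5 endpoint of the PageSpeed Insights API.
PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def normalize_url(url: str) -> str:
    """Ensure exactly one trailing slash (assumes slash-terminated URLs)."""
    return url.rstrip("/") + "/"


def unique_urls(urls):
    """Drop duplicates after normalization, keeping the original order."""
    return list(dict.fromkeys(normalize_url(u) for u in urls))


def build_psi_url(page_url: str, api_key: str, strategy: str = "mobile") -> str:
    """Build the request URL; urlencode escapes the tested page URL."""
    params = {"url": page_url, "strategy": strategy, "locale": "en", "key": api_key}
    return f"{PSI_ENDPOINT}?{urlencode(params)}"


pages = unique_urls(["https://example.com/a", "https://example.com/a/"])
request_url = build_psi_url(pages[0], "YOUR_API_KEY")
print(request_url)
```

The resulting string can be passed to `requests.get()` in place of the hand-built f-string.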

Check the status code. In the sitemaps, there might be non-200 status code URLs; these should be cleaned.

         if r.status_code == 200:
               data_ = json.loads(r.text)

The next section reads the specific metrics from the “data_” dictionary and appends them to the lists that we have created before.

               performance_score.append(data_["lighthouseResult"]["categories"]["performance"]["score"] * 100)
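
Direct indexing raises an error as soon as one key is missing, for example when a test run fails and the score is absent. A minimal sketch of a defensive lookup follows; the helper `dig` is hypothetical, and the small dictionary is a made-up stand-in for a parsed response.

```python
def dig(data, *keys, default=None):
    """Walk nested dicts/lists; return default when any step is missing."""
    for key in keys:
        try:
            data = data[key]
        except (KeyError, IndexError, TypeError):
            return default
    return data


# Trimmed, made-up stand-in for a parsed PSI response.
data_ = {"lighthouseResult": {"categories": {"performance": {"score": 0.87}}}}

performance_score = []
score = dig(data_, "lighthouseResult", "categories", "performance", "score")
# Keep a None placeholder so that all metric lists stay the same length.
performance_score.append(round(score * 100) if score is not None else None)
print(performance_score)  # [87]
```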

Next section focuses on “total task” count, and DOM Size.


The next section takes the “DOM Depth” and “Deepest DOM Element.”
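
The task and DOM values sit in the “audits” part of the Lighthouse result. The sketch below shows one way to read them, assuming the “diagnostics” and “dom-size” audits are present and shaped as in the made-up excerpt. The audit ids and field names are assumptions to verify against a real response, since they can differ between Lighthouse versions.

```python
# Made-up excerpt shaped like data_["lighthouseResult"]["audits"].
# Audit ids and field names are assumptions; verify them in a real response.
audits = {
    "diagnostics": {"details": {"items": [
        {"numTasks": 1200, "numTasksOver50ms": 8, "totalTaskTime": 3400.5}
    ]}},
    "dom-size": {"details": {"items": [
        {"statistic": "Total DOM Elements", "value": 950},
        {"statistic": "Maximum DOM Depth", "value": 18},
        {"statistic": "Maximum Child Elements", "value": 42},
    ]}},
}

total_tasks, total_tasks_time, long_tasks = [], [], []
dom_size, maximum_dom_depth, maximum_child_element = [], [], []

# Task counts and timing come from the first item of the diagnostics audit.
diag = audits["diagnostics"]["details"]["items"][0]
total_tasks.append(diag.get("numTasks"))
total_tasks_time.append(diag.get("totalTaskTime"))
long_tasks.append(diag.get("numTasksOver50ms"))

# Look the DOM statistics up by label rather than by list position.
dom = {row["statistic"]: row["value"] for row in audits["dom-size"]["details"]["items"]}
dom_size.append(dom.get("Total DOM Elements"))
maximum_dom_depth.append(dom.get("Maximum DOM Depth"))
maximum_child_element.append(dom.get("Maximum Child Elements"))
```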


The next section takes the specific observed test results during our Page Speed Insights API.
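
Because the observed values share one parent object, a mapping from column name to response field keeps this section short. The sketch below assumes the fields live in the first item of the “metrics” audit; the field names and sample numbers are assumptions for illustration and should be checked against a real response.

```python
# Assumed field names inside audits["metrics"]["details"]["items"][0].
OBSERVED_KEYS = {
    "observed_fcp": "observedFirstContentfulPaint",
    "observed_lcp": "observedLargestContentfulPaint",
    "observed_dom_content_loaded": "observedDomContentLoaded",
    "observed_speed_index": "observedSpeedIndex",
}

# Made-up values standing in for one URL's metrics item.
metrics_item = {
    "observedFirstContentfulPaint": 812,
    "observedLargestContentfulPaint": 1540,
    "observedDomContentLoaded": 990,
}

columns = {name: [] for name in OBSERVED_KEYS}
for name, key in OBSERVED_KEYS.items():
    # .get() stores None for a missing field, so the lists stay aligned.
    columns[name].append(metrics_item.get(key))

print(columns["observed_lcp"])          # [1540]
print(columns["observed_speed_index"])  # [None]
```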


The next section takes the unused code amount as wasted bytes and wasted milliseconds, along with the possible savings from render-blocking resources.


The next section is to provide responsive image benefits and server response timing.
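
The possible savings from unused code, render-blocking resources, server response time, and responsive images can be read with one small helper. This is a sketch under assumptions: the audit ids and the “overallSavingsMs” and “overallSavingsBytes” fields are assumed to match the response, the helper `saving` is hypothetical, and the numbers are made up.

```python
def saving(audits, audit_id, field):
    """Read one savings figure; None when the audit or field is absent."""
    return audits.get(audit_id, {}).get("details", {}).get(field)


# Made-up excerpt; "uses-responsive-images" is left out to show the fallback.
audits = {
    "render-blocking-resources": {"details": {"overallSavingsMs": 450}},
    "unused-javascript": {"details": {"overallSavingsMs": 300, "overallSavingsBytes": 120000}},
    "unused-css-rules": {"details": {"overallSavingsMs": 150, "overallSavingsBytes": 45000}},
    "server-response-time": {"details": {"overallSavingsMs": 210}},
}

row = {
    "render_blocking_resources_ms_save": saving(audits, "render-blocking-resources", "overallSavingsMs"),
    "unused_javascript_ms_save": saving(audits, "unused-javascript", "overallSavingsMs"),
    "unused_javascript_byte_save": saving(audits, "unused-javascript", "overallSavingsBytes"),
    "unused_css_rules_ms_save": saving(audits, "unused-css-rules", "overallSavingsMs"),
    "unused_css_rules_bytes_save": saving(audits, "unused-css-rules", "overallSavingsBytes"),
    "possible_server_response_time_saving": saving(audits, "server-response-time", "overallSavingsMs"),
    "possible_responsive_image_ms_save": saving(audits, "uses-responsive-images", "overallSavingsMs"),
}
```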


The next section is to make the function continue to work in case there is an error.
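
One way to keep the loop running is to wrap the per-URL extraction in “try/except” and skip the URL when something is missing. The sketch below illustrates the pattern with made-up payloads instead of live API calls; `extract_score` and the `failed` list are hypothetical names.

```python
def extract_score(payload):
    """Raise KeyError or TypeError when the response lacks the score."""
    return payload["lighthouseResult"]["categories"]["performance"]["score"] * 100


# Made-up payloads: the second one is missing the expected keys.
payloads = {
    "https://example.com/": {
        "lighthouseResult": {"categories": {"performance": {"score": 0.5}}}
    },
    "https://example.com/broken/": {},
}

url, performance_score, failed = [], [], []
for page, payload in payloads.items():
    try:
        value = extract_score(payload)
    except (KeyError, TypeError):
        # Record the failure and move on to the next URL.
        failed.append(page)
        continue
    # Append to every list only after extraction succeeded,
    # so the lists keep the same length.
    url.append(page)
    performance_score.append(value)
```

Catching specific exceptions instead of a bare “except” keeps unrelated bugs visible.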


Example Usage Of Page Speed Insights API With Python For Bulk Testing

To use the specific code blocks, put them into a Python function.

Run the script, and you will get 29 page speed-related metrics in the columns below.
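
The final step, turning the metric lists into a data frame and naming the output file, could look like the sketch below. The helper names, the file name pattern, and the sample values are assumptions for illustration; only two of the columns are shown.

```python
import pandas as pd


def build_frame(columns: dict) -> pd.DataFrame:
    """Turn equally long metric lists into a data frame (hypothetical helper)."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("metric lists have different lengths")
    return pd.DataFrame(columns)


def output_name(domain: str, date: str) -> str:
    # Hypothetical naming pattern: <domain>_<date>_psi.xlsx
    return f"{domain}_{date}_psi.xlsx"


df = build_frame({
    "url": ["https://example.com/", "https://example.com/about/"],
    "performance_score": [87.0, 92.0],
})

# Writing XLSX needs an Excel engine such as openpyxl to be installed:
# df.to_excel(output_name("example", "2022_06_15"), index=False)
```

The length check catches the case where one metric was appended for a URL and another was not, which would otherwise make the data frame constructor fail with a less specific message.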

Screenshot from author, June 2022


PageSpeed Insights API provides different types of page loading performance metrics.

It demonstrates how Google engineers perceive the concept of page loading performance, and how they possibly use these metrics from a ranking, UX, and quality-understanding point of view.

Using Python for bulk page speed tests gives you a snapshot of the entire website to help analyze the possible user experience, crawl efficiency, conversion rate, and ranking improvements.



A Comprehensive Guide To Marketing Attribution Models




We all know that customers interact with a brand through multiple channels and campaigns (online and offline) along their path to conversion.

Surprisingly, within the B2B sector, the average customer is exposed to a brand 36 times before converting into a customer.

With so many touchpoints, it is difficult to really pin down just how much a marketing channel or campaign influenced the decision to buy.

This is where marketing attribution comes in.

Marketing attribution provides insights into the most effective touchpoints along the buyer journey.

In this comprehensive guide, we simplify everything you need to know to get started with marketing attribution models, including an overview of your options and how to use them.

What Is Marketing Attribution?

Marketing attribution is the rule (or set of rules) that says how the credit for a conversion is distributed across a buyer’s journey.

How much credit each touchpoint should get is one of the more complicated marketing topics, which is why so many different types of attribution models are used today.

6 Common Attribution Models

There are six common attribution models, and each distributes conversion value across the buyer’s journey differently.

Don’t worry. We will help you understand all of the models below so you can decide which is best for your needs.

Note: The examples in this guide use Google Analytics 4 cross-channel rules-based models.

Cross-channel rules-based means that it ignores direct traffic. This may not be the case if you use alternative analytics software.

1. Last Click

The last click attribution model gives all the credit to the marketing touchpoint that happens directly before conversion.

Last Click helps you understand which marketing efforts close sales.

For example, a user initially discovers your brand by watching a YouTube Ad for 30 seconds (engaged view).

Later that day, the same user Googles your brand and clicks through an organic search result.

The following week this user is shown a retargeting ad on Facebook, clicks through, and signs up for your email newsletter.

The next day, they click through the email and convert to a customer.

Under a last-click attribution model, 100% of the credit for that conversion is given to email, the touchpoint that closed the sale.

2. First Click

The first click is the opposite of the last click attribution model.

All of the credit for any conversion that may happen is awarded to the first interaction.

The first click helps you to understand which channels create brand awareness.

It doesn’t matter if the customer clicked through a retargeting ad and later converted through an email visit.

If the customer initially interacted with your brand through an engaged YouTube view, Paid Video gets full credit for that conversion because it started the journey.

3. Linear

Linear attribution provides a look at your marketing strategy as a whole.

This model is especially useful if you need to maintain awareness throughout the entire buyer journey.

Credit for conversion is split evenly among all the channels a customer interacts with.

Let’s look at our example: Each of the four touchpoints (Paid Video, Organic, Paid Social, and Email) gets 25% of the conversion value because they’re all given equal credit.
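
The first three rule sets can be written down in a few lines. The sketch below is an illustration of the rules as described in this guide, not GA4’s implementation, and it assumes each channel appears only once in the path.

```python
def last_click(path):
    """All credit goes to the final touchpoint."""
    return {ch: (1.0 if i == len(path) - 1 else 0.0) for i, ch in enumerate(path)}


def first_click(path):
    """All credit goes to the first touchpoint."""
    return {ch: (1.0 if i == 0 else 0.0) for i, ch in enumerate(path)}


def linear(path):
    """Credit is split evenly across all touchpoints."""
    return {ch: 1.0 / len(path) for ch in path}


path = ["Paid Video", "Organic", "Paid Social", "Email"]
print(last_click(path)["Email"])        # 1.0
print(first_click(path)["Paid Video"])  # 1.0
print(linear(path)["Organic"])          # 0.25
```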

4. Time Decay

Time Decay is useful for short sales cycles like a promotion because it considers when each touchpoint occurred.

The first touch gets the least amount of credit, while the last click gets the most.

Using our example:

  • Paid Video (YouTube engaged view) would get 10% of the credit.
  • Organic search would get 20%.
  • Paid Social (Facebook ad) gets 30%.
  • Email, which occurred the day of the conversion, gets 40%.

Note: Google Analytics 4 distributes this credit using a seven-day half-life.
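
A half-life means that a touchpoint seven days before the conversion counts half as much as one on the conversion day. The sketch below applies that rule to the example journey. The day counts are assumptions read from the story (first two touches eight days before, the Facebook ad one day before), and GA4’s internal calculation is not reproduced here. With these assumed ages, the shares come out at roughly 16%, 16%, 32%, and 36%, so the 10/20/30/40 split above is best read as a simplified illustration.

```python
def time_decay(days_before_conversion, half_life_days=7.0):
    """Weight each touchpoint by 0.5 ** (age / half-life), then normalize."""
    weights = {
        ch: 0.5 ** (age / half_life_days)
        for ch, age in days_before_conversion.items()
    }
    total = sum(weights.values())
    return {ch: w / total for ch, w in weights.items()}


# Assumed ages in days, measured back from the conversion.
ages = {"Paid Video": 8, "Organic": 8, "Paid Social": 1, "Email": 0}
shares = time_decay(ages)
for channel, share in shares.items():
    print(f"{channel}: {share:.1%}")
```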

5. Position-Based

The position-based (U-shaped) approach divides credit for a sale between the two most critical interactions: how a client discovered your brand and the interaction that generated a conversion.

With position-based attribution modeling, Paid Video (YouTube engaged view) and Email would each get 40% of the credit because they were the first and last interaction within our example.

Organic search and the Facebook Ad would each get 10%.
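
The position-based split can be sketched as follows, assuming 40% for each end and the remaining 20% shared equally by the middle touchpoints. How paths with only one or two touchpoints are handled is an assumption of this sketch, and each channel is assumed to appear once.

```python
def position_based(path, end_share=0.4):
    """Give end_share to the first and last touch; split the rest evenly."""
    n = len(path)
    if n == 1:
        return {path[0]: 1.0}
    if n == 2:
        # Assumption: with no middle touches, the two ends share the credit.
        return {path[0]: 0.5, path[1]: 0.5}
    middle_share = (1.0 - 2 * end_share) / (n - 2)
    credit = {ch: middle_share for ch in path[1:-1]}
    credit[path[0]] = end_share
    credit[path[-1]] = end_share
    return credit


path = ["Paid Video", "Organic", "Paid Social", "Email"]
credit = position_based(path)
for channel in path:
    print(f"{channel}: {credit[channel]:.0%}")
```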

6. Data-Driven

Google Analytics 4 has a unique data-driven attribution model that uses machine learning algorithms.

Credit is assigned based on how each touchpoint changes the estimated conversion probability.

It uses each advertiser’s data to calculate the actual contribution an interaction had for every conversion event.

Best Marketing Attribution Model

There isn’t necessarily a “best” marketing attribution model, and there’s no reason to limit yourself to just one.

Comparing performance under different attribution models will help you to understand the importance of multiple touchpoints along your buyer journey.

Model Comparison In Google Analytics 4 (GA4)

If you want to see how performance changes by attribution model, you can do that easily with GA4.

To access model comparison in Google Analytics 4, click “Advertising” in the left-hand menu and then click “Model comparison” under “Attribution.”

Screenshot from GA4, July 2022

By default, all conversion events are selected, the date range is the last 28 days, and the dimension is the default channel grouping.

Start by selecting the date range and conversion event you want to analyze.

Screenshot from GA4, July 2022

You can add a filter to view a specific campaign, geographic location, or device using the edit comparison option in the top right of the report.

Screenshot from GA4, July 2022

Select the dimension to report on and then use the drop-down menus to select the attribution models to compare.

Screenshot from GA4, July 2022

GA4 Model Comparison Example

Let’s say you’re asked to increase new customers to the website.

You could open Google Analytics 4 and compare the “last-click” model to the “first-click” model to discover which marketing efforts start customers down the path to conversion.

Screenshot from GA4, July 2022

In the example above, we may choose to look further into email and paid search because they appear to be more effective at starting customers down the path to conversion than closing the sale.

How To Change Google Analytics 4 Attribution Model

If you choose a different attribution model for your company, you can edit your attribution settings by clicking the gear icon in the bottom left-hand corner.

Open Attribution Settings under the property column and click the Reporting attribution model drop-down menu.

Here you can choose from the six cross-channel attribution models discussed above or the “ads-preferred last click model.”

Ads-preferred gives full credit to the last Google Ads click along the conversion path.

Screenshot from GA4, July 2022

Please note that attribution model changes will apply to historical and future data.

Final Thoughts

Determining where and when a lead or purchase occurred is easy. The hard part is defining the reason behind a lead or purchase.

Comparing attribution modeling reports helps us to understand how the entire buyer journey supported the conversion.

Looking at this information in greater depth enables marketers to maximize ROI.

Got questions? Let us know on Twitter or LinkedIn.
