

Competitor Backlink Analysis With Python [Complete Script]



In my last article, we analyzed our backlinks using data from Ahrefs.

This time around, we’re including the competitor backlinks in our analysis using the same Ahrefs data source for comparison.

Like last time, we defined the value of a site’s backlinks for SEO as a product of quality and quantity.

Quality is domain authority (or Ahrefs’ equivalent domain rating) and quantity is the number of referring domains.

Again, we’ll evaluate the link quality with the available data before evaluating the quantity.

Time to code.

import os  # needed for os.listdir() below
import re
import time
import random
import pandas as pd
import numpy as np
import datetime
from datetime import timedelta
from plotnine import *
import matplotlib.pyplot as plt
from pandas.api.types import is_string_dtype
from pandas.api.types import is_numeric_dtype
import uritools

pd.set_option('display.max_colwidth', None)
%matplotlib inline

root_domain = ''
hostdomain = ''
full_domain = ''
ahrefs_path = ''  # folder containing the Ahrefs CSV exports
target_name = 'John Sankey'

Data Import & Cleaning

We set up the file directories to read multiple Ahrefs exported data files in one folder, which is much faster, less boring, and more efficient than reading each file individually.

Especially when you have more than 10 of them!


The listdir() function from the os module allows us to list all files in a subdirectory.

ahrefs_filenames = os.listdir(ahrefs_path)

The file names are now listed below:
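If the export folder contains anything besides the Ahrefs CSVs, filtering on the file extension keeps the read loop safe. A minimal sketch, using a temporary folder as a stand-in for your own export path:

```python
import os
import tempfile

# stand-in folder; in practice ahrefs_path points at your Ahrefs exports
ahrefs_path = tempfile.mkdtemp()
for name in ('www.johnsankey.co.uk-backlinks.csv', 'notes.txt'):
    open(os.path.join(ahrefs_path, name), 'w').close()

# keep only the CSV exports so stray files don't break the read loop
ahrefs_filenames = [f for f in os.listdir(ahrefs_path) if f.endswith('.csv')]
```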


With the files listed, we’ll now read each one in a for loop, collecting the results in a list before combining them into a single dataframe.

While reading in the file we’ll use some string manipulation to create a new column with the site name of the data we’re importing.

ahrefs_df_lst = list()

for filename in ahrefs_filenames:
    df = pd.read_csv(ahrefs_path + filename)
    # derive the site name from the export's filename
    df['site'] = filename
    df['site'] = df['site'].str.replace('www.', '', regex = False)    
    df['site'] = df['site'].str.replace('.csv', '', regex = False)
    df['site'] = df['site'].str.replace('-.+', '', regex = True)
    ahrefs_df_lst.append(df)

ahrefs_df_raw = pd.concat(ahrefs_df_lst)
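To see what those three replaces do, here’s the chain applied to a hypothetical export filename (the name itself is illustrative, following Ahrefs’ naming pattern):

```python
import pandas as pd

# hypothetical export filename
names = pd.Series(['www.johnsankey.co.uk-backlinks.csv'])
site = (names.str.replace('www.', '', regex = False)
             .str.replace('.csv', '', regex = False)
             .str.replace('-.+', '', regex = True))
# site.iloc[0] → 'johnsankey.co.uk'
```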
Ahrefs dofollow raw data (Image from Ahrefs, May 2022)

Now we have the raw data from each site in a single dataframe. The next step is to tidy up the column names and make them a bit friendlier to work with.

Although the repetition could be eliminated with a custom function or a list comprehension, spelling it out step by step is good practice and makes it easier for beginner SEO Pythonistas to see what’s happening. As they say, “repetition is the mother of mastery,” so get practicing!

competitor_ahrefs_cleancols = ahrefs_df_raw
competitor_ahrefs_cleancols.columns = [col.lower() for col in competitor_ahrefs_cleancols.columns]
competitor_ahrefs_cleancols.columns = [col.replace(' ','_') for col in competitor_ahrefs_cleancols.columns]
competitor_ahrefs_cleancols.columns = [col.replace('.','_') for col in competitor_ahrefs_cleancols.columns]
competitor_ahrefs_cleancols.columns = [col.replace('__','_') for col in competitor_ahrefs_cleancols.columns]
competitor_ahrefs_cleancols.columns = [col.replace('(','') for col in competitor_ahrefs_cleancols.columns]
competitor_ahrefs_cleancols.columns = [col.replace(')','') for col in competitor_ahrefs_cleancols.columns]
competitor_ahrefs_cleancols.columns = [col.replace('%','') for col in competitor_ahrefs_cleancols.columns]
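As noted, the seven steps above can be collapsed into a single helper. A sketch of that one-pass version (the function name is my own):

```python
import re

def clean_col(col):
    """One-pass version of the column cleaning above."""
    col = col.lower().replace(' ', '_').replace('.', '_')
    col = re.sub(r'[()%]', '', col)   # drop brackets and percent signs
    return re.sub(r'_+', '_', col)    # collapse doubled underscores
```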

A count column and a single-value column (‘project’) will be useful for groupby and aggregation operations later.

competitor_ahrefs_cleancols['rd_count'] = 1
competitor_ahrefs_cleancols['project'] = target_name

Ahrefs competitor data (Image from Ahrefs, May 2022)

The columns are cleaned up, so now we’ll clean up the row data.

competitor_ahrefs_clean_dtypes = competitor_ahrefs_cleancols

For referring domains, we’re replacing hyphens with zero and setting the data type as an integer (i.e., whole number).

This will be repeated for linked domains, also.

competitor_ahrefs_clean_dtypes['dofollow_ref_domains'] = np.where(competitor_ahrefs_clean_dtypes['dofollow_ref_domains'] == '-',
                                                           0, competitor_ahrefs_clean_dtypes['dofollow_ref_domains'])
competitor_ahrefs_clean_dtypes['dofollow_ref_domains'] = competitor_ahrefs_clean_dtypes['dofollow_ref_domains'].astype(int)

# linked_domains

competitor_ahrefs_clean_dtypes['dofollow_linked_domains'] = np.where(competitor_ahrefs_clean_dtypes['dofollow_linked_domains'] == '-',
                                                           0, competitor_ahrefs_clean_dtypes['dofollow_linked_domains'])
competitor_ahrefs_clean_dtypes['dofollow_linked_domains'] = competitor_ahrefs_clean_dtypes['dofollow_linked_domains'].astype(int)
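An equivalent, slightly more defensive pattern is pd.to_numeric with errors='coerce', which turns any non-numeric placeholder (not just the hyphen) into NaN before filling with zero. A sketch on toy data:

```python
import pandas as pd

col = pd.Series(['12', '-', '7'])  # Ahrefs uses '-' where there is no data
cleaned = pd.to_numeric(col, errors = 'coerce').fillna(0).astype(int)
# cleaned.tolist() → [12, 0, 7]
```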


First seen gives us a date point at which links were found, which we can use for time series plotting and deriving the link age.

We’ll convert to date format using the to_datetime function.

# first_seen
competitor_ahrefs_clean_dtypes['first_seen'] = pd.to_datetime(competitor_ahrefs_clean_dtypes['first_seen'], 
                                                              format="%d/%m/%Y %H:%M")
competitor_ahrefs_clean_dtypes['first_seen'] = competitor_ahrefs_clean_dtypes['first_seen'].dt.normalize()
competitor_ahrefs_clean_dtypes['month_year'] = competitor_ahrefs_clean_dtypes['first_seen'].dt.to_period('M')

To calculate the link_age we’ll simply deduct the first seen date from today’s date and convert the difference into a number.

# link age
competitor_ahrefs_clean_dtypes['link_age'] = datetime.datetime.now() - competitor_ahrefs_clean_dtypes['first_seen']
competitor_ahrefs_clean_dtypes['link_age'] = competitor_ahrefs_clean_dtypes['link_age'].astype(int)
# nanoseconds to days
competitor_ahrefs_clean_dtypes['link_age'] = (competitor_ahrefs_clean_dtypes['link_age']/(3600 * 24 * 1000000000)).round(0)
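The same age-in-days calculation can be written more directly with the timedelta accessor, avoiding the nanosecond arithmetic. A sketch on toy dates:

```python
import pandas as pd

df = pd.DataFrame({'first_seen': pd.to_datetime(
    ['01/06/2020 00:00', '15/03/2022 00:00'], format = '%d/%m/%Y %H:%M')})
# subtracting datetimes yields a timedelta Series; .dt.days gives whole days
df['link_age'] = (pd.Timestamp.now().normalize() - df['first_seen']).dt.days
```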

The target column helps us distinguish the “client” site from the competitors, which is useful for visualization later.

competitor_ahrefs_clean_dtypes['target'] = np.where(competitor_ahrefs_clean_dtypes['site'].str.contains('johns'),
                                                                                            1, 0)
competitor_ahrefs_clean_dtypes['target'] = competitor_ahrefs_clean_dtypes['target'].astype('category')

Ahrefs clean data types (Image from Ahrefs, May 2022)

Now that the data is cleaned up both in terms of column titles and row values we’re ready to set forth and start analyzing.

Link Quality

We’ll start with link quality, taking Ahrefs’ Domain Rating (DR) as the measure.

Let’s start by inspecting the distribution of DR using the geom_boxplot function.

competitor_ahrefs_analysis = competitor_ahrefs_clean_dtypes

comp_dr_dist_box_plt = (
    ggplot(competitor_ahrefs_analysis.loc[competitor_ahrefs_analysis['dr'] > 0], 
           aes(x = 'reorder(site, dr)', y = 'dr', colour = 'target')) + 
    geom_boxplot(alpha = 0.6) +
    scale_y_continuous() +   
    theme(legend_position = 'none', 
          axis_text_x = element_text(rotation = 90, hjust = 1)))

comp_dr_dist_box_plt.save(filename = 'comp_dr_dist_box_plt.png',  # filename is illustrative
                          height = 5, width = 10, units = 'in', dpi = 1000)
Competition distribution (Image from Ahrefs, May 2022)

The plot compares the site’s statistical properties side by side, and most notably, the interquartile range showing where most referring domains fall in terms of domain rating.

We also see that John Sankey has the fourth-highest median domain rating, which compares favorably on link quality against the other sites.

William Garvey has the most diverse range of DR compared with other domains, indicating ever so slightly more relaxed criteria for link acquisition. Who knows.

Link Volumes

That’s quality. What about the volume of links from referring domains?

To tackle that, we’ll compute a running sum of referring domains using the groupby function.

competitor_count_cumsum_df = competitor_ahrefs_analysis

competitor_count_cumsum_df = competitor_count_cumsum_df.groupby(['site', 'month_year'])['rd_count'].sum().reset_index()

The expanding function allows the calculation window to grow with the number of rows which is how we achieve our running sum.

# group by site so each competitor's running sum accumulates independently
competitor_count_cumsum_df['count_runsum'] = (competitor_count_cumsum_df
    .groupby('site')['rd_count'].expanding().sum()
    .reset_index(level = 0, drop = True))
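On a toy frame, a grouped cumulative sum (the per-site equivalent of an expanding sum) looks like this:

```python
import pandas as pd

toy = pd.DataFrame({'site': ['a', 'a', 'b', 'b'],
                    'rd_count': [2, 3, 1, 4]})
# each site's running total accumulates independently
toy['count_runsum'] = toy.groupby('site')['rd_count'].cumsum()
# toy['count_runsum'].tolist() → [2, 5, 1, 5]
```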

Ahrefs cumulative sum data (Image from Ahrefs, May 2022)

The result is a data frame with the site, month_year and count_runsum (the running sum), which is in the perfect format to feed the graph.

# month_year is a Period; convert to a timestamp so scale_x_date can handle it
competitor_count_cumsum_df['month_year'] = competitor_count_cumsum_df['month_year'].dt.to_timestamp()

competitor_count_cumsum_plt = (
    ggplot(competitor_count_cumsum_df, aes(x = 'month_year', y = 'count_runsum', 
                                           group = 'site', colour = 'site')) + 
    geom_line(alpha = 0.6, size = 2) +
    labs(y = 'Running Sum of Referring Domains', x = 'Month Year') + 
    scale_y_continuous() + 
    scale_x_date() +
    theme(legend_position = 'right', 
          axis_text_x = element_text(rotation = 90, hjust = 1)))

competitor_count_cumsum_plt.save(filename = 'competitor_count_cumsum_plt.png',  # filename is illustrative
                                 height = 5, width = 10, units = 'in', dpi = 1000)

Competitor graph (Image from Ahrefs, May 2022)

The plot shows the number of referring domains for each site since 2014.

I find the different starting positions for each site quite interesting when they begin acquiring links.

For example, William Garvey started with over 5,000 domains. I’d love to know who their PR agency is!

We can also see the rate of growth. For example, although Hadley Rose started link acquisition in 2018, things really took off around mid-2021.

More, More, And More

You can always do more scientific analysis.

For example, one immediate and natural extension of the above would be to combine both the quality (DR) and the quantity (volume) for a more holistic view of how the sites compare in terms of offsite SEO.
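As a sketch of that quality × quantity combination (the data and score are illustrative, following the cleaned schema above), one naive approach multiplies each site’s referring-domain count by its median DR:

```python
import pandas as pd

# hypothetical frame: one row per referring domain, with its DR
links = pd.DataFrame({'site': ['a', 'a', 'b'],
                      'dr':   [70, 30, 50]})
summary = links.groupby('site').agg(ref_domains = ('dr', 'size'),
                                    median_dr   = ('dr', 'median'))
# a naive quality-times-quantity score per site
summary['link_power'] = summary['ref_domains'] * summary['median_dr']
```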

Other extensions would be to model the qualities of those referring domains for both your own and your competitor sites to see which link features (such as the number of words or relevance of the linking content) could explain the difference in visibility between you and your competitors.

This model extension would be a good application of machine learning techniques.


Featured Image: F8 studio/Shutterstock





4 Ways To Try The New Model From Mistral AI

In a significant leap in large language model (LLM) development, Mistral AI announced the release of its newest model, Mixtral-8x7B.

What Is Mixtral-8x7B?

Mixtral-8x7B from Mistral AI is a Mixture of Experts (MoE) model designed to enhance how machines understand and generate text.

Imagine it as a team of specialized experts, each skilled in a different area, working together to handle various types of information and tasks.

A report published in June shed light on the intricacies of OpenAI’s GPT-4, highlighting that it employs a similar MoE approach, utilizing 16 experts, each with around 111 billion parameters, and routing two experts per forward pass to optimize costs.

This approach allows the model to manage diverse and complex data efficiently, making it helpful in creating content, engaging in conversations, or translating languages.

Mixtral-8x7B Performance Metrics

Mistral AI’s new model, Mixtral-8x7B, represents a significant step forward from its previous model, Mistral-7B-v0.1.

It’s designed to better understand and generate text, a key feature for anyone looking to use AI for writing or communication tasks.

This latest addition to the Mistral family promises to revolutionize the AI landscape with its enhanced performance metrics, as shared by OpenCompass.


What makes Mixtral-8x7B stand out is not just its improvement over Mistral AI’s previous version, but the way it measures up to models like Llama2-70B and Qwen-72B.

Mixtral-8x7B performance metrics compared to Llama 2 open-source AI models

It’s like having an assistant who can understand complex ideas and express them clearly.

One of the key strengths of the Mixtral-8x7B is its ability to handle specialized tasks.

For example, it performed exceptionally well in specific tests designed to evaluate AI models, indicating that it’s good at general text understanding and generation and excels in more niche areas.

This makes it a valuable tool for marketing professionals and SEO experts who need AI that can adapt to different content and technical requirements.

The Mixtral-8x7B’s ability to deal with complex math and coding problems also suggests it can be a helpful ally for those working in more technical aspects of SEO, where understanding and solving algorithmic challenges are crucial.

This new model could become a versatile and intelligent partner for a wide range of digital content and strategy needs.

How To Try Mixtral-8x7B: 4 Demos

You can experiment with Mistral AI’s new model, Mixtral-8x7B, to see how it responds to queries and how it performs compared to other open-source models and OpenAI’s GPT-4.

Please note that, like all generative AI content, platforms running this new model may produce inaccurate information or otherwise unintended results.

User feedback for new models like this one will help companies like Mistral AI improve future versions and models.

1. Perplexity Labs Playground

In Perplexity Labs, you can try Mixtral-8x7B along with Meta AI’s Llama 2, Mistral-7b, and Perplexity’s new online LLMs.

In this example, I ask about the model itself and notice that new instructions are added after the initial response to extend the generated content about my query.

Mixtral-8x7B in the Perplexity Labs playground (Screenshot from Perplexity, December 2023)

While the answer looks correct, it begins to repeat itself.

Mixtral-8x7B repetition errors (Screenshot from Perplexity Labs, December 2023)

The model did provide an answer of over 600 words to the question, “What is SEO?”

Again, additional instructions appear as “headers” to seemingly ensure a comprehensive answer.

“What is SEO?” answered by Mixtral-8x7B (Screenshot from Perplexity Labs, December 2023)

2. Poe

Poe hosts bots for popular LLMs, including OpenAI’s GPT-4 and DALL·E 3, Meta AI’s Llama 2 and Code Llama, Google’s PaLM 2, Anthropic’s Claude-instant and Claude 2, and StableDiffusionXL.

These bots cover a wide spectrum of capabilities, including text, image, and code generation.

The Mixtral-8x7B-Chat bot is operated by Fireworks AI.

Poe bot for Mixtral-8x7B (Screenshot from Poe, December 2023)

It’s worth noting that the Fireworks page specifies it is an “unofficial implementation” that was fine-tuned for chat.

When asked what the best backlinks for SEO are, it provided a valid answer.

Mixtral-8x7B’s response on the best backlinks (Screenshot from Poe, December 2023)

Compare this to the response offered by Google Bard.

Google Bard’s response (Screenshot from Google Bard, December 2023)

3. Vercel

Vercel offers a demo of Mixtral-8x7B that allows users to compare responses from popular Anthropic, Cohere, Meta AI, and OpenAI models.

Vercel demo comparing Mixtral-8x7B with GPT-4 (Screenshot from Vercel, December 2023)

It offers an interesting perspective on how each model interprets and responds to user questions.

Mixtral-8x7B vs. Cohere on the best resources for learning SEO (Screenshot from Vercel, December 2023)

Like many LLMs, it does occasionally hallucinate.

Mixtral-8x7B hallucinations (Screenshot from Vercel, December 2023)

4. Replicate

The mixtral-8x7b-32 demo on Replicate is based on this source code. It is also noted in the README that “Inference is quite inefficient.”

Mixtral-8x7B demo on Replicate (Screenshot from Replicate, December 2023)

In the example above, Mixtral-8x7B describes itself as a game.


Mistral AI’s latest release sets a new benchmark in the AI field, offering enhanced performance and versatility. But like many LLMs, it can provide inaccurate and unexpected answers.

As AI continues to evolve, models like the Mixtral-8x7B could become integral in shaping advanced AI tools for marketing and business.

Featured image: T. Schneider/Shutterstock



OpenAI Investigates 'Lazy' GPT-4 Complaints On Google Reviews, X

OpenAI, the company that launched ChatGPT a little over a year ago, has recently taken to social media to address concerns about GPT-4’s “lazy” performance raised on social platforms and Google Reviews.

OpenAI’s response on X (Screenshot from X, December 2023)

This move comes after growing user feedback online, which even includes a one-star review on the company’s Google Reviews.

OpenAI Gives Insight Into Training Chat Models, Performance Evaluations, And A/B Testing

OpenAI, through its @ChatGPTapp Twitter account, detailed the complexities involved in training chat models.

ChatGPT OpenAI A/B testing (Screenshot from X, December 2023)

The organization highlighted that the process is not a “clean industrial process” and that variations in training runs can lead to noticeable differences in the AI’s personality, creative style, and political bias.

Thorough AI model testing includes offline evaluation metrics and online A/B tests. The final decision to release a new model is based on a data-driven approach to improve the “real” user experience.

OpenAI’s Google Review Score Affected By GPT-4 Performance, Billing Issues

This explanation comes after weeks of user feedback about GPT-4 becoming worse on social media networks like X.

Complaints also appeared in OpenAI’s community forums.

GPT-4 user feedback in the OpenAI community forums (Screenshot from OpenAI, December 2023)

The experience led one user to leave a one-star rating for OpenAI via Google Reviews. Other complaints regarded accounts, billing, and the artificial nature of AI.

OpenAI’s Google Reviews star rating (Screenshot from Google Reviews, December 2023)

A recent user on Product Hunt gave OpenAI a rating that also appears related to GPT-4’s worsening performance.

OpenAI reviews on Product Hunt (Screenshot from Product Hunt, December 2023)

GPT-4 isn’t the only issue that local reviewers complain about. On Yelp, OpenAI has a one-star rating for ChatGPT 3.5 performance.

The complaint:

OpenAI’s ChatGPT review on Yelp (Screenshot from Yelp, December 2023)

In related OpenAI news, the review with the most likes aligns with recent rumors about a volatile workplace, alleging that OpenAI is a “Cutthroat environment. Not friendly. Toxic workers.”

Google review alleging a toxic workplace at OpenAI (Screenshot from Google Reviews, December 2023)

The OpenAI reviews voted most helpful on Glassdoor suggest that employee frustration and product development issues stem from the company’s shift in focus toward profits.

OpenAI employee reviews (Screenshots from Glassdoor, December 2023)

This incident provides a unique outlook on how customer and employee experiences can impact any business through local reviews and business ratings platforms.

OpenAI’s Google Business Profile in local SERPs (Screenshot from Google, December 2023)

Google SGE Highlights Positive Google Reviews

In addition to occasional complaints, Google reviewers acknowledged the revolutionary impact of OpenAI’s technology on various fields.

The most positive mentions of the company appear in Google SGE (Search Generative Experience).

Google SGE response on OpenAI (Screenshot from Google SGE, December 2023)


OpenAI’s recent insights into training chat models and response to public feedback about GPT-4 performance illustrate AI technology’s dynamic and evolving nature and its impact on those who depend on the AI platform.

Especially the people who just received an invitation to join ChatGPT Plus after being waitlisted while OpenAI paused new subscriptions and upgrades. Or those developing GPTs for the upcoming GPT Store launch.

As AI advances, professionals in these fields must remain agile, informed, and responsive to technological developments and the public’s reception of these advancements.

Featured image: Tada Images/Shutterstock



ChatGPT Plus Upgrades Paused; Waitlisted Users Receive Invites

ChatGPT Plus subscriptions and upgrades remain paused after a surge in demand for new features created outages.

Some users who signed up for the waitlist have received invites to join ChatGPT Plus.

ChatGPT Plus waitlist invite email (Screenshot from Gmail, December 2023)

This has resulted in a few shares of the invite link, which, for now, is accessible to everyone.

RELATED: GPT Store Set To Launch In 2024 After ‘Unexpected’ Delays

In addition to the invites, signs that more people are getting access to GPTs include an introductory screen popping up on free ChatGPT accounts.

GPTs introductory screen (Screenshot from ChatGPT, December 2023)

Unfortunately, they still aren’t accessible without a Plus subscription.

ChatGPT Plus subscriptions and upgrades paused (Screenshot from ChatGPT, December 2023)

You can sign up for the waitlist by clicking on the option to upgrade in the left sidebar of ChatGPT on a desktop browser.

Upgrade option in the ChatGPT sidebar (Screenshot from ChatGPT, December 2023)

OpenAI also suggests ChatGPT Enterprise for those who need more capabilities, as outlined in the pricing plans below.

OpenAI pricing plans (Screenshot from OpenAI, December 2023)

Why Are ChatGPT Plus Subscriptions Paused?

According to a post on X by OpenAI’s CEO Sam Altman, the recent surge in usage following the DevDay developers conference has led to capacity challenges, resulting in the decision to pause ChatGPT Plus signups.

The decision to pause new ChatGPT signups follows a week in which OpenAI services – including ChatGPT and the API – experienced a series of outages related to high demand and DDoS attacks.

Demand for ChatGPT Plus resulted in eBay listings supposedly offering one or more months of the premium subscription.

When Will ChatGPT Plus Subscriptions Resume?

So far, we don’t have any official word on when ChatGPT Plus subscriptions will resume. We know the GPT Store is set to open early next year after recent boardroom drama led to “unexpected delays.”

Therefore, we hope that OpenAI will onboard waitlisted users in time to try out all of the GPTs created by OpenAI and community builders.

What Are GPTs?

GPTs allow users to create one or more personalized ChatGPT experiences based on a specific set of instructions, knowledge files, and actions.

Search marketers with ChatGPT Plus can try GPTs for helpful content assessment and learning SEO.

There are also GPTs for analyzing Google Search Console data.

And GPTs that will let you chat with analytics data from 20 platforms, including Google Ads, GA4, and Facebook.

Google search has indexed hundreds of public GPTs. According to an alleged list of GPT statistics in a GitHub repository, DALL-E, the top GPT from OpenAI, has received 5,620,981 visits since its launch last month. Included in the top 20 GPTs is Canva, with 291,349 views.


Weighing The Benefits Of The Pause

Ideally, this means that developers working on building GPTs and using the API should encounter fewer issues (like being unable to save GPT drafts).

But it could also mean a temporary decrease in new users of GPTs since they are only available to Plus subscribers – including the ones I tested for learning about ranking factors and gaining insights on E-E-A-T from Google’s Search Quality Rater Guidelines.

Custom GPTs for SEO (Screenshot from ChatGPT, November 2023)

Featured image: Robert Way/Shutterstock
