SEO
Competitor Backlink Analysis With Python [Complete Script]
In my last article, we analyzed our backlinks using data from Ahrefs.
This time around, we’re including the competitor backlinks in our analysis using the same Ahrefs data source for comparison.
As before, we define the value of a site’s backlinks for SEO as a product of quality and quantity.
Quality is domain authority (or Ahrefs’ equivalent domain rating) and quantity is the number of referring domains.
Again, we’ll evaluate the link quality with the available data before evaluating the quantity.
Time to code.
import re
import os  # needed for os.listdir() further down
import time
import random
import pandas as pd
import numpy as np
import datetime
from datetime import timedelta
from plotnine import *
import matplotlib.pyplot as plt
from pandas.api.types import is_string_dtype
from pandas.api.types import is_numeric_dtype
import uritools
pd.set_option('display.max_colwidth', None)
%matplotlib inline
root_domain = 'johnsankey.co.uk'
hostdomain = 'www.johnsankey.co.uk'
hostname = 'johnsankey'
full_domain = 'https://www.johnsankey.co.uk'
target_name = 'John Sankey'
Data Import & Cleaning
We set up the file directories to read multiple Ahrefs exported data files in one folder, which is much faster, less boring, and more efficient than reading each file individually.
Especially when you have more than 10 of them!
ahrefs_path="data/"
The listdir() function from the os module allows us to list all files in a subdirectory.
ahrefs_filenames = os.listdir(ahrefs_path)
# Drop the macOS metadata file only if it exists (remove() raises ValueError otherwise)
if '.DS_Store' in ahrefs_filenames:
    ahrefs_filenames.remove('.DS_Store')
ahrefs_filenames
File names now listed below:
['www.davidsonlondon.com--refdomains-subdomain__2022-03-13_23-37-29.csv',
 'www.stephenclasper.co.uk--refdomains-subdoma__2022-03-13_23-47-28.csv',
 'www.touchedinteriors.co.uk--refdomains-subdo__2022-03-13_23-42-05.csv',
 'www.lushinteriors.co--refdomains-subdomains__2022-03-13_23-44-34.csv',
 'www.kassavello.com--refdomains-subdomains__2022-03-13_23-43-19.csv',
 'www.tulipinterior.co.uk--refdomains-subdomai__2022-03-13_23-41-04.csv',
 'www.tgosling.com--refdomains-subdomains__2022-03-13_23-38-44.csv',
 'www.onlybespoke.com--refdomains-subdomains__2022-03-13_23-45-28.csv',
 'www.williamgarvey.co.uk--refdomains-subdomai__2022-03-13_23-43-45.csv',
 'www.hadleyrose.co.uk--refdomains-subdomains__2022-03-13_23-39-31.csv',
 'www.davidlinley.com--refdomains-subdomains__2022-03-13_23-40-25.csv',
 'johnsankey.co.uk-refdomains-subdomains__2022-03-18_15-15-47.csv']
With the files listed, we’ll now read each one individually using a for loop, and add these to a dataframe.
While reading in the file we’ll use some string manipulation to create a new column with the site name of the data we’re importing.
ahrefs_df_lst = list()
ahrefs_colnames = list()
for filename in ahrefs_filenames:
    df = pd.read_csv(ahrefs_path + filename)
    # Derive the site name from the file name
    df['site'] = filename
    df['site'] = df['site'].str.replace('www.', '', regex = False)
    df['site'] = df['site'].str.replace('.csv', '', regex = False)
    df['site'] = df['site'].str.replace('-.+', '', regex = True)
    ahrefs_colnames.append(df.columns)
    ahrefs_df_lst.append(df)
ahrefs_df_raw = pd.concat(ahrefs_df_lst)
ahrefs_df_raw
Image from Ahrefs, May 2022
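To see what the three string replacements do to a single file name, here is a minimal sketch using only the standard library. The helper name site_from_filename is hypothetical and not part of the original script; it simply mirrors the same three steps outside of pandas.

```python
import re

def site_from_filename(filename):
    """Mirror the three replacements applied to the 'site' column above."""
    name = filename.replace('www.', '')   # drop the www. prefix
    name = name.replace('.csv', '')       # drop the file extension
    # Remove everything from the first hyphen onwards (the Ahrefs export suffix)
    return re.sub('-.+', '', name)

sites = [
    site_from_filename('www.davidsonlondon.com--refdomains-subdomain__2022-03-13_23-37-29.csv'),
    site_from_filename('johnsankey.co.uk-refdomains-subdomains__2022-03-18_15-15-47.csv'),
]
print(sites)
```

One caveat with this pattern: a domain that itself contains a hyphen would be cut at that hyphen, so it is worth checking the resulting site names if your competitor list includes one.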
Now we have the raw data from each site in a single dataframe. The next step is to tidy up the column names and make them a bit friendlier to work with.
Although the repetition could be eliminated with a custom function or a list comprehension, it is good practice and easier for beginner SEO Pythonistas to see what’s happening step by step. As they say, “repetition is the mother of mastery,” so get practicing!
competitor_ahrefs_cleancols = ahrefs_df_raw
competitor_ahrefs_cleancols.columns = [col.lower() for col in competitor_ahrefs_cleancols.columns]
competitor_ahrefs_cleancols.columns = [col.replace(' ','_') for col in competitor_ahrefs_cleancols.columns]
competitor_ahrefs_cleancols.columns = [col.replace('.','_') for col in competitor_ahrefs_cleancols.columns]
competitor_ahrefs_cleancols.columns = [col.replace('__','_') for col in competitor_ahrefs_cleancols.columns]
competitor_ahrefs_cleancols.columns = [col.replace('(','') for col in competitor_ahrefs_cleancols.columns]
competitor_ahrefs_cleancols.columns = [col.replace(')','') for col in competitor_ahrefs_cleancols.columns]
competitor_ahrefs_cleancols.columns = [col.replace('%','') for col in competitor_ahrefs_cleancols.columns]
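For readers who have done the repetition and want the compact version, the same clean-up could be folded into one small function. This is only a sketch: clean_column_name is a hypothetical helper, and the raw header strings in the example are assumed for illustration rather than taken from an actual export.

```python
def clean_column_name(col):
    """Apply the same replacements as the step-by-step version, in the same order."""
    col = col.lower()
    replacements = [(' ', '_'), ('.', '_'), ('__', '_'), ('(', ''), (')', ''), ('%', '')]
    for old, new in replacements:
        col = col.replace(old, new)
    return col

# Assumed raw headers, for illustration only
raw_columns = ['Dofollow ref. domains', 'First seen']
clean_columns = [clean_column_name(col) for col in raw_columns]
print(clean_columns)
```

With a dataframe, the equivalent one-liner would be to assign [clean_column_name(col) for col in df.columns] back to df.columns.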
A count column (‘rd_count’) and a single-value column (‘project’) are useful for groupby and aggregation operations.
competitor_ahrefs_cleancols['rd_count'] = 1
competitor_ahrefs_cleancols['project'] = target_name
competitor_ahrefs_cleancols
The columns are cleaned up, so now we’ll clean up the row data.
competitor_ahrefs_clean_dtypes = competitor_ahrefs_cleancols
For referring domains, we’re replacing hyphens with zero and setting the data type as an integer (i.e., whole number).
This will be repeated for linked domains, also.
# referring domains
competitor_ahrefs_clean_dtypes['dofollow_ref_domains'] = np.where(competitor_ahrefs_clean_dtypes['dofollow_ref_domains'] == '-',
                                                                  0, competitor_ahrefs_clean_dtypes['dofollow_ref_domains'])
competitor_ahrefs_clean_dtypes['dofollow_ref_domains'] = competitor_ahrefs_clean_dtypes['dofollow_ref_domains'].astype(int)
# linked_domains
competitor_ahrefs_clean_dtypes['dofollow_linked_domains'] = np.where(competitor_ahrefs_clean_dtypes['dofollow_linked_domains'] == '-',
                                                                     0, competitor_ahrefs_clean_dtypes['dofollow_linked_domains'])
competitor_ahrefs_clean_dtypes['dofollow_linked_domains'] = competitor_ahrefs_clean_dtypes['dofollow_linked_domains'].astype(int)
First seen gives us a date point at which links were found, which we can use for time series plotting and deriving the link age.
We’ll convert to date format using the to_datetime function.
# first_seen
competitor_ahrefs_clean_dtypes['first_seen'] = pd.to_datetime(competitor_ahrefs_clean_dtypes['first_seen'],
                                                              format = '%d/%m/%Y %H:%M')
competitor_ahrefs_clean_dtypes['first_seen'] = competitor_ahrefs_clean_dtypes['first_seen'].dt.normalize()
competitor_ahrefs_clean_dtypes['month_year'] = competitor_ahrefs_clean_dtypes['first_seen'].dt.to_period('M')
To calculate the link_age we’ll simply deduct the first seen date from today’s date and convert the difference into a number.
# link age
# pd.Timestamp.now() replaces dt.datetime.now(), because no 'dt' alias is imported above
competitor_ahrefs_clean_dtypes['link_age'] = pd.Timestamp.now() - competitor_ahrefs_clean_dtypes['first_seen']
# Convert the time difference into a whole number of days
competitor_ahrefs_clean_dtypes['link_age'] = (competitor_ahrefs_clean_dtypes['link_age'].dt.total_seconds() / (3600 * 24)).round(0)
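To make the arithmetic concrete, here is a small standard-library sketch of the same calculation for a single date. The function name link_age_days and the sample dates are hypothetical; the date format string is the one used for first_seen above.

```python
from datetime import datetime

def link_age_days(first_seen, today):
    """Age of a link in whole days, given a first-seen string like '13/03/2022 23:37'."""
    seen = datetime.strptime(first_seen, '%d/%m/%Y %H:%M')
    # Drop the time of day, as dt.normalize() does in the dataframe version
    seen = seen.replace(hour=0, minute=0)
    return round((today - seen).total_seconds() / (3600 * 24))

# Fixed reference dates so the example is reproducible
age_afternoon = link_age_days('13/03/2022 23:37', datetime(2022, 3, 18, 15, 15))
age_midnight = link_age_days('13/03/2022 23:37', datetime(2022, 3, 18))
print(age_afternoon, age_midnight)
```

Because the current time of day is included in the difference, the rounded age can tick up by one day in the afternoon, which is harmless for this analysis but worth knowing when comparing runs.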
The target column helps us distinguish the “client” site from its competitors, which is useful for visualization later.
competitor_ahrefs_clean_dtypes['target'] = np.where(competitor_ahrefs_clean_dtypes['site'].str.contains('johns'),
                                                    1, 0)
competitor_ahrefs_clean_dtypes['target'] = competitor_ahrefs_clean_dtypes['target'].astype('category')
competitor_ahrefs_clean_dtypes
Now that the data is cleaned up both in terms of column titles and row values we’re ready to set forth and start analyzing.
Link Quality
We start with link quality, for which we’ll accept Domain Rating (DR) as the measure.
Let’s begin by inspecting the distributional properties of DR, plotting the distribution for each site with the geom_boxplot function.
# Assumption: the analysis dataframe is the cleaned dataframe from the previous step
competitor_ahrefs_analysis = competitor_ahrefs_clean_dtypes
comp_dr_dist_box_plt = (
    ggplot(competitor_ahrefs_analysis.loc[competitor_ahrefs_analysis['dr'] > 0],
           aes(x = 'reorder(site, dr)', y = 'dr', colour = 'target')) +
    geom_boxplot(alpha = 0.6) +
    scale_y_continuous() +
    theme(legend_position = 'none',
          axis_text_x = element_text(rotation=90, hjust=1))
)
comp_dr_dist_box_plt.save(filename = 'images/4_comp_dr_dist_box_plt.png',
                          height=5, width=10, units = 'in', dpi=1000)
comp_dr_dist_box_plt
The plot compares the sites’ statistical properties side by side, most notably the interquartile range, which shows where most referring domains fall in terms of domain rating.
We also see that John Sankey has the fourth-highest median domain rating, which compares well against the other sites in terms of link quality.
William Garvey has the most diverse range of DR compared with other domains, indicating ever so slightly more relaxed criteria for link acquisition. Who knows.
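If you want the numbers behind a box plot rather than reading them off the chart, the median and interquartile range can be computed directly. The sketch below uses made-up DR values and hypothetical site names purely for illustration; the quartile method used by the plotting library may differ slightly from the one chosen here.

```python
from statistics import median, quantiles

# Hypothetical DR values per site, for illustration only
dr_by_site = {
    'site-a.example': [10, 20, 30, 40, 50],
    'site-b.example': [5, 15, 60, 70, 90],
}

def dr_summary(values):
    """Median, quartiles and interquartile range for a list of DR values."""
    q1, _, q3 = quantiles(values, n=4, method='inclusive')
    return {'median': median(values), 'q1': q1, 'q3': q3, 'iqr': q3 - q1}

summaries = {site: dr_summary(values) for site, values in dr_by_site.items()}
for site, summary in summaries.items():
    print(site, summary)
```

A wider interquartile range, as with the second site here, is what a taller box in the plot represents.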
Link Volumes
That’s quality. What about the volume of links from referring domains?
To tackle that, we’ll compute a running sum of referring domains using the groupby function.
competitor_count_cumsum_df = competitor_ahrefs_analysis
competitor_count_cumsum_df = competitor_count_cumsum_df.groupby(['site', 'month_year'])['rd_count'].sum().reset_index()
The expanding function allows the calculation window to grow with the number of rows, which is how we achieve our running sum.
competitor_count_cumsum_df['count_runsum'] = competitor_count_cumsum_df['rd_count'].expanding().sum()
competitor_count_cumsum_df
The result is a data frame with the site, month_year and count_runsum (the running sum), which is in the perfect format to feed the graph.
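One thing worth checking at this point: expanding().sum() applied to the whole column keeps accumulating from one site's rows into the next. If the intention is a running sum that restarts for each site, grouping by site before taking the cumulative sum is one option. The sketch below uses a tiny made-up dataframe (hypothetical site names and counts) to show the difference between the two.

```python
import pandas as pd

# Hypothetical monthly referring-domain counts for two sites
monthly = pd.DataFrame({
    'site': ['a.example', 'a.example', 'b.example', 'b.example'],
    'month_year': ['2022-01', '2022-02', '2022-01', '2022-02'],
    'rd_count': [3, 2, 10, 5],
})

# Running sum over every row: the second site starts from the first site's total
monthly['runsum_all_rows'] = monthly['rd_count'].expanding().sum()

# Running sum that restarts for each site
monthly['runsum_per_site'] = monthly.groupby('site')['rd_count'].cumsum()

print(monthly)
```

In the per-site column, each site starts from its own first month, so starting positions in a plot reflect that site's links only.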
competitor_count_cumsum_plt = (
    ggplot(competitor_count_cumsum_df,
           aes(x = 'month_year', y = 'count_runsum', group = 'site', colour = 'site')) +
    geom_line(alpha = 0.6, size = 2) +
    labs(y = 'Running Sum of Referring Domains', x = 'Month Year') +
    scale_y_continuous() +
    scale_x_date() +
    theme(legend_position = 'right',
          axis_text_x = element_text(rotation=90, hjust=1))
)
competitor_count_cumsum_plt.save(filename = 'images/5_count_cumsum_smooth_plt.png',
                                 height=5, width=10, units = 'in', dpi=1000)
competitor_count_cumsum_plt
The plot shows the number of referring domains for each site since 2014.
I find it quite interesting how different the starting positions are for each site when they begin acquiring links.
For example, William Garvey started with over 5,000 domains. I’d love to know who their PR agency is!
We can also see the rate of growth. For example, although Hadley Rose started link acquisition in 2018, things really took off around mid-2021.
More, More, And More
You can always do more scientific analysis.
For example, one immediate and natural extension of the above would be to combine both the quality (DR) and the quantity (volume) for a more holistic view of how the sites compare in terms of offsite SEO.
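As a rough starting point for that combined view, here is a minimal sketch that multiplies a per-site quality figure by a per-site quantity figure, in line with the "product of quality and quantity" definition at the top of the article. The choice of median DR as the quality figure, the column name link_value, and the sample data are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical referring-domain rows: one row per referring domain
links = pd.DataFrame({
    'site': ['a.example', 'a.example', 'a.example', 'b.example', 'b.example'],
    'dr': [10, 30, 50, 20, 40],
    'rd_count': [1, 1, 1, 1, 1],
})

# Quality = median DR, quantity = number of referring domains
summary = (
    links.groupby('site')
         .agg(median_dr=('dr', 'median'), ref_domains=('rd_count', 'sum'))
         .reset_index()
)
summary['link_value'] = summary['median_dr'] * summary['ref_domains']

link_value_by_site = dict(zip(summary['site'], summary['link_value']))
print(summary)
```

Other quality summaries (mean DR, or the share of domains above a DR threshold) would slot into the same pattern.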
Other extensions would be to model the qualities of those referring domains for both your own and your competitor sites to see which link features (such as the number of words or relevance of the linking content) could explain the difference in visibility between you and your competitors.
This model extension would be a good application of machine learning techniques.
Featured Image: F8 studio/Shutterstock
SEO
How To Write ChatGPT Prompts To Get The Best Results
ChatGPT is a game changer in the field of SEO. This powerful language model can generate human-like content, making it an invaluable tool for SEO professionals.
However, the prompts you provide largely determine the quality of the output.
To unlock the full potential of ChatGPT and create content that resonates with your audience and search engines, writing effective prompts is crucial.
In this comprehensive guide, we’ll explore the art of writing prompts for ChatGPT, covering everything from basic techniques to advanced strategies for layering prompts and generating high-quality, SEO-friendly content.
Writing Prompts For ChatGPT
What Is A ChatGPT Prompt?
A ChatGPT prompt is an instruction or discussion topic a user provides for the ChatGPT AI model to respond to.
The prompt can be a question, statement, or any other stimulus to spark creativity, reflection, or engagement.
Users can use the prompt to generate ideas, share their thoughts, or start a conversation.
ChatGPT prompts are designed to be open-ended and can be customized based on the user’s preferences and interests.
How To Write Prompts For ChatGPT
Start by giving ChatGPT a writing prompt, such as, “Write a short story about a person who discovers they have a superpower.”
ChatGPT will then generate a response based on your prompt. Depending on the prompt’s complexity and the level of detail you requested, the answer may be a few sentences or several paragraphs long.
Use the ChatGPT-generated response as a starting point for your writing. You can take the ideas and concepts presented in the answer and expand upon them, adding your own unique spin to the story.
If you want to generate additional ideas, try asking ChatGPT follow-up questions related to your original prompt.
For example, you could ask, “What challenges might the person face in exploring their newfound superpower?” Or, “How might the person’s relationships with others be affected by their superpower?”
Remember that ChatGPT’s answers are generated by artificial intelligence and may not always be perfect or exactly what you want.
However, they can still be a great source of inspiration and help you start writing.
Must-Have GPTs Assistant
I recommend installing the WebBrowser Assistant created by the OpenAI Team. This tool allows you to add relevant Bing results to your ChatGPT prompts.
This assistant adds the first web results to your ChatGPT prompts for more accurate and up-to-date conversations.
It is very easy to install in only two clicks. (Click on Start Chat.)
For example, if I ask, “Who is Vincent Terrasi?,” ChatGPT has no answer.
With WebBrowser Assistant, the assistant creates a new prompt with the first Bing results, and now ChatGPT knows who Vincent Terrasi is.
You can test other GPT assistants available in the GPTs search engine if you want to use Google results.
Master Reverse Prompt Engineering
ChatGPT can be an excellent tool for reverse engineering prompts because it generates natural and engaging responses to any given input.
By analyzing the prompts generated by ChatGPT, it is possible to gain insight into the model’s underlying thought processes and decision-making strategies.
One key benefit of using ChatGPT to reverse engineer prompts is that the model is highly transparent in its decision-making.
This means that the reasoning and logic behind each response can be traced, making it easier to understand how the model arrives at its conclusions.
Once you’ve done this a few times for different types of content, you’ll gain insight into crafting more effective prompts.
Prepare Your ChatGPT For Generating Prompts
First, activate the reverse prompt engineering.
- Type the following prompt: “Enable Reverse Prompt Engineering? By Reverse Prompt Engineering I mean creating a prompt from a given text.”
ChatGPT is now ready to generate your prompt. You can test the product description in a new chatbot session and evaluate the generated prompt.
- Type: “Create a very technical reverse prompt engineering template for a product description about iPhone 11.”
The result is amazing. You can test with a full text that you want to reproduce. Here is an example of a prompt for selling a Kindle on Amazon.
- Type: “Reverse Prompt engineer the following {product}, capture the writing style and the length of the text :
product =”
I tested it on an SEJ blog post. Enjoy the analysis – it is excellent.
- Type: “Reverse Prompt engineer the following {text}, capture the tone and writing style of the {text} to include in the prompt :
text = all text coming from https://www.searchenginejournal.com/google-bard-training-data/478941/”
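If you reuse this reverse prompt often, it can help to keep it as a template and fill in the text programmatically. The sketch below only builds the prompt string; the names REVERSE_PROMPT_TEMPLATE and build_reverse_prompt are hypothetical, and sending the prompt to ChatGPT is left out.

```python
# Doubled braces keep the literal {text} placeholder in the wording of the prompt
REVERSE_PROMPT_TEMPLATE = (
    "Reverse Prompt engineer the following {{text}}, capture the tone and "
    "writing style of the {{text}} to include in the prompt :\n"
    "text = {text}"
)

def build_reverse_prompt(text):
    """Return the reverse prompt with the given source text appended."""
    return REVERSE_PROMPT_TEMPLATE.format(text=text.strip())

prompt = build_reverse_prompt("  Our new e-reader is lighter than a paperback.  ")
print(prompt)
```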
But be careful not to use ChatGPT to generate your texts. It is just a personal assistant.
Go Deeper
Prompts and examples for SEO:
- Keyword research and content ideas prompt: “Provide a list of 20 long-tail keyword ideas related to ‘local SEO strategies’ along with brief content topic descriptions for each keyword.”
- Optimizing content for featured snippets prompt: “Write a 40-50 word paragraph optimized for the query ‘what is the featured snippet in Google search’ that could potentially earn the featured snippet.”
- Creating meta descriptions prompt: “Draft a compelling meta description for the following blog post title: ‘10 Technical SEO Factors You Can’t Ignore in 2024’.”
Important Considerations:
- Always Fact-Check: While ChatGPT can be a helpful tool, it’s crucial to remember that it may generate inaccurate or fabricated information. Always verify any facts, statistics, or quotes generated by ChatGPT before incorporating them into your content.
- Maintain Control and Creativity: Use ChatGPT as a tool to assist your writing, not replace it. Don’t rely on it to do your thinking or create content from scratch. Your unique perspective and creativity are essential for producing high-quality, engaging content.
- Iteration is Key: Refine and revise the outputs generated by ChatGPT to ensure they align with your voice, style, and intended message.
Additional Prompts for Rewording and SEO:
- Rewrite this sentence to be more concise and impactful.
- Suggest alternative phrasing for this section to improve clarity.
- Identify opportunities to incorporate relevant internal and external links.
- Analyze the keyword density and suggest improvements for better SEO.
Remember, while ChatGPT can be a valuable tool, it’s essential to use it responsibly and maintain control over your content creation process.
Experiment And Refine Your Prompting Techniques
Writing effective prompts for ChatGPT is an essential skill for any SEO professional who wants to harness the power of AI-generated content.
Hopefully, the insights and examples shared in this article can inspire you and help guide you to crafting stronger prompts that yield high-quality content.
Remember to experiment with layering prompts, iterating on the output, and continually refining your prompting techniques.
This will help you stay ahead of the curve in the ever-changing world of SEO.
Featured Image: Tapati Rinchumrus/Shutterstock
SEO
Measuring Content Impact Across The Customer Journey
Understanding the impact of your content at every touchpoint of the customer journey is essential – but that’s easier said than done. From attracting potential leads to nurturing them into loyal customers, there are many touchpoints to look into.
So how do you identify and take advantage of these opportunities for growth?
Watch this on-demand webinar and learn a comprehensive approach for measuring the value of your content initiatives, so you can optimize resource allocation for maximum impact.
You’ll learn:
- Fresh methods for measuring your content’s impact.
- Fascinating insights using first-touch attribution, and how it differs from the usual last-touch perspective.
- Ways to persuade decision-makers to invest in more content by showcasing its value convincingly.
With Bill Franklin and Oliver Tani of DAC Group, we unravel the nuances of attribution modeling, emphasizing the significance of layering first-touch and last-touch attribution within your measurement strategy.
Check out these insights to help you craft compelling content tailored to each stage, using an approach rooted in first-hand experience to ensure your content resonates.
Whether you’re a seasoned marketer or new to content measurement, this webinar promises valuable insights and actionable tactics to elevate your SEO game and optimize your content initiatives for success.
View the slides below or check out the full webinar for all the details.
SEO
How to Find and Use Competitor Keywords
Competitor keywords are the keywords your rivals rank for in Google’s search results. They may rank organically or pay for Google Ads to rank in the paid results.
Knowing your competitors’ keywords is the easiest form of keyword research. If your competitors rank for or target particular keywords, it might be worth it for you to target them, too.
There is no way to see your competitors’ keywords without a tool like Ahrefs, which has a database of keywords and the sites that rank for them. As far as we know, Ahrefs has the biggest database of these keywords.
How to find all the keywords your competitor ranks for
- Go to Ahrefs’ Site Explorer
- Enter your competitor’s domain
- Go to the Organic keywords report
The report is sorted by traffic to show you the keywords sending your competitor the most visits. For example, Mailchimp gets most of its organic traffic from the keyword “mailchimp.”
Since you’re unlikely to rank for your competitor’s brand, you might want to exclude branded keywords from the report. You can do this by adding a Keyword > Doesn’t contain filter. In this example, we’ll filter out keywords containing “mailchimp” or any potential misspellings.
If you’re a new brand competing with one that’s established, you might also want to look for popular low-difficulty keywords. You can do this by setting the Volume filter to a minimum of 500 and the KD filter to a maximum of 10.
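If you prefer to work on an exported keyword list, the same filters can be sketched in a few lines of Python. Everything in this example is assumed for illustration: the field names (keyword, volume, kd), the sample rows and their numbers, and the list of brand misspellings.

```python
# Hypothetical rows standing in for an exported Organic keywords report
keywords = [
    {'keyword': 'mailchimp', 'volume': 800000, 'kd': 70},
    {'keyword': 'mailchimp login', 'volume': 300000, 'kd': 60},
    {'keyword': 'email marketing tips', 'volume': 900, 'kd': 8},
    {'keyword': 'newsletter ideas', 'volume': 400, 'kd': 5},
    {'keyword': 'what is a crm', 'volume': 5000, 'kd': 45},
]

# Brand name plus a couple of assumed misspellings
brand_terms = ('mailchimp', 'mail chimp', 'mailchip')

def is_opportunity(row, min_volume=500, max_kd=10):
    """Non-branded keyword with enough volume and low enough difficulty."""
    is_branded = any(term in row['keyword'].lower() for term in brand_terms)
    return (not is_branded) and row['volume'] >= min_volume and row['kd'] <= max_kd

opportunities = [row['keyword'] for row in keywords if is_opportunity(row)]
print(opportunities)
```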
How to find keywords your competitor ranks for, but you don’t
- Go to Competitive Analysis
- Enter your domain in the This target doesn’t rank for section
- Enter your competitor’s domain in the But these competitors do section
Hit “Show keyword opportunities,” and you’ll see all the keywords your competitor ranks for, but you don’t.
You can also add a Volume and KD filter to find popular, low-difficulty keywords in this report.
How to find keywords multiple competitors rank for, but you don’t
- Go to Competitive Analysis
- Enter your domain in the This target doesn’t rank for section
- Enter the domains of multiple competitors in the But these competitors do section
You’ll see all the keywords that at least one of these competitors ranks for, but you don’t.
You can also narrow the list down to keywords that all competitors rank for. Click on the Competitors’ positions filter and choose All 3 competitors.
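The difference between "at least one competitor" and "all competitors" is the difference between a union and an intersection of keyword sets. Here is a small sketch with made-up domains and keywords to illustrate the logic; it is not how the report is implemented.

```python
# Hypothetical keyword sets for our site and three competitors
our_keywords = {'email marketing', 'newsletter ideas'}
competitor_keywords = {
    'competitor-a.example': {'email marketing', 'drip campaign', 'email templates'},
    'competitor-b.example': {'drip campaign', 'email templates', 'crm software'},
    'competitor-c.example': {'drip campaign', 'landing page builder'},
}

# Keywords at least one competitor ranks for, but we don't
gap_any = set().union(*competitor_keywords.values()) - our_keywords

# Keywords every competitor ranks for, but we don't
gap_all = set.intersection(*competitor_keywords.values()) - our_keywords

print(sorted(gap_any))
print(sorted(gap_all))
```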
How to find keywords your competitor is bidding on
- Go to Ahrefs’ Site Explorer
- Enter your competitor’s domain
- Go to the Paid keywords report
This report shows you the keywords your competitors are targeting via Google Ads.
Since your competitor is paying for traffic from these keywords, it may indicate that they’re profitable for them—and could be for you, too.
You know what keywords your competitors are ranking for or bidding on. But what do you do with them? There are basically three options.
1. Create pages to target these keywords
You can only rank for keywords if you have content about them. So, the most straightforward thing you can do for competitors’ keywords you want to rank for is to create pages to target them.
However, before you do this, it’s worth clustering your competitor’s keywords by Parent Topic. This will group keywords that mean the same or similar things so you can target them all with one page.
Here’s how to do that:
- Export your competitor’s keywords, either from the Organic Keywords or Content Gap report
- Paste them into Keywords Explorer
- Click the “Clusters by Parent Topic” tab
For example, Mailchimp ranks for keywords like “what is digital marketing” and “digital marketing definition.” These and many others get clustered under the Parent Topic of “digital marketing” because people searching for them are all looking for the same thing: a definition of digital marketing. You only need to create one page to potentially rank for all these keywords.
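Conceptually, clustering by Parent Topic is a group-by: every keyword is filed under its parent topic, and each group becomes a candidate for one page. The sketch below assumes you already have keyword and parent-topic pairs (for instance, from an export); the pairs shown are taken from the examples in this article, and the variable names are hypothetical.

```python
from collections import defaultdict

# (keyword, parent topic) pairs, as they might appear in an export
rows = [
    ('what is digital marketing', 'digital marketing'),
    ('digital marketing definition', 'digital marketing'),
    ('press release example', 'press release template'),
    ('press release format', 'press release template'),
]

# Group keywords under their parent topic: one group per potential page
clusters = defaultdict(list)
for keyword, parent_topic in rows:
    clusters[parent_topic].append(keyword)

for parent_topic, grouped in clusters.items():
    print(parent_topic, '->', grouped)
```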
2. Optimize existing content by filling subtopics
You don’t always need to create new content to rank for competitors’ keywords. Sometimes, you can optimize the content you already have to rank for them.
How do you know which keywords you can do this for? Try this:
- Export your competitor’s keywords
- Paste them into Keywords Explorer
- Click the “Clusters by Parent Topic” tab
- Look for Parent Topics you already have content about
For example, if we analyze our competitor, we can see that seven keywords they rank for fall under the Parent Topic of “press release template.”
If we search our site, we see that we already have a page about this topic.
If we click the caret and check the keywords in the cluster, we see keywords like “press release example” and “press release format.”
To rank for the keywords in the cluster, we can probably optimize the page we already have by adding sections about the subtopics of “press release examples” and “press release format.”
3. Target these keywords with Google Ads
Paid keywords are the simplest—look through the report and see if there are any relevant keywords you might want to target, too.
For example, Mailchimp is bidding for the keyword “how to create a newsletter.”
If you’re ConvertKit, you may also want to target this keyword since it’s relevant.
If you decide to target the same keyword via Google Ads, you can hover over the magnifying glass to see the ads your competitor is using.
You can also see the landing page your competitor directs ad traffic to under the URL column.