Connect with us

SEO

SEO Trends, Organic Growth & Personal Branding With Craig Campbell

Published

on

SEO Trends, Organic Growth & Personal Branding With Craig Campbell

The game of search has evolved into something far beyond just optimizing your site for Google alone.

Digital marketers are pressured to deliver results and are often conflicted about which marketing channels to prioritize or add into the mix.

“You don’t want to focus on one way of getting traffic.” That’s the advice of Craig Campbell: a well-known SEO professional from Glasgow and PageOptimizer Pro’s #1 most influential SEO of 2020.

We had a chance to catch up with the man behind the SEO YouTube channel that welcomes you with “knowledge bombs that will make you money” – to get his take on the latest SEO trends, digital marketing tools worth checking out, and agency life.

Read on and glean new insight as he shares bits of hard-earned wisdom from his 20 years of experience in the SEO industry.

Past And Future SEO Trends

Being in the SEO business for two decades, how much has the SEO landscape changed since you first became interested?

Craig Campbell: “While it has changed a lot in some ways, we still have the core fundamentals of content and links being massively important: the same way they were at the very start of my journey in this industry.

Sure, things have evolved a great deal, and the quality of content, the relevance of links, and a lot of other nuances are in place. But the basics are still very similar.

What I do love is that these days, the learning curve is a lot easier, and we have clever people all over the world creating amazing tools to help us with competitor analysis and much more.

Whereas back then, it was a lot of trial and error, embracing the changes and utilizing the tools to make the job a lot easier has helped a lot over recent years.

But I think, for me, learning how to do it the hard way, using my own brain and common sense, and not having everything handed to me on a plate … it really did help me learn.

It took longer, but I won’t lie – it was a lot of fun, too. So these days, I find SEO a lot easier as I once had to do it the hard way.”

What do you know about SEO now that you wish you’d known when you first started?

CC: “I’ve been asked this a lot. I’ve enjoyed the whole journey. And I’ve made countless mistakes, but they have gotten me to where I am today. However, one thing I struggled with at the start was building SOPs and training my internal team to do the tasks I wanted to do.

For many years, I struggled to do this properly, and it massively hampered my ability to scale and contributed a lot of unnecessary stress to my life. So, learning to delegate and building SOPs [standard operating procedures] much sooner would have been good.

Other things, like trying and testing for myself and trying to read between the lines when I watch a talk or presentation, are things I wish I had done. I was a little naive back in the day and used to take things at face value and would simply add some of what other people had said without doing my own testing.

Like many others at the start of their careers, I didn’t know how good I was, but there becomes this part of the journey where you undervalue yourself or allow your prices to be driven down, and before you know it, you have a whole heap of clients who are paying you very little and wasting all of your time, energy, and resources.

So, I wish someone had sat me down and tried to give me that advice. But unfortunately, we [were] all in a similar position when I started in the early 2000s – no one knew what they were doing, let alone their actual value.”

Where do you expect the SEO industry is heading in the next three years?

CC: “This is a question that is really difficult to answer; I’ve seen and heard people say things over the years like ‘voice search‘ is the next big thing, and ‘let’s all double down on that.’

We have seen people talk about ‘AMP‘ and many other things, including AI content and how we will replace content writers with AI. I don’t think a lot of these things have worked out too well.

And without being a specialist in technology and how all of these things are being developed, I don’t see any major dramatic changes over the coming years.

It’s clear as day that Google is trying to force the organic search positions further down. However, organic traffic still converts really well.

But 20 years into the industry, I still see many websites and SEOs still not doing the basics properly. So, I think people need to level up on their processes and SOPs and how they see their website and start to treat them as a real business. I think that’s where people will see gains over the next few years.

Nothing massively new in terms of major changes to the industry; we do evolve, Google does bring updates out, and of course, those cutting corners or not doing the basics right eventually get penalized in some way, shape, or form.”

Marketing Tools And Channels To Drive Traffic

Is there one SEO tool, in particular, that you’d recommend for local businesses?

CC: “One tool, for local, is really hard. I use a number of tools for different elements of local, like Local Falcon, for checking out my Google Business Profile’s ranking positions.

I really do think even now, many small businesses don’t realize how much traffic comes from those map positions.

For sure, loads of people do it in our SEO community, but overall that’s, in reality, a small part of the world. I see so many businesses out there who are not even ranking those, let alone local landing pages.”

How about a particular marketing channel that can be beneficial for driving organic traffic?

CC: “This is something I’m often missing out on. Platforms, such as Reddit and Pinterest, are ones I hear people getting amazing traffic from, but I’ve yet to dive into them properly.

I recently bought a Pinterest course to try and work out what people are doing on there to get all this traffic. But over the years, I have built up a good email list, always capturing people’s data – a very old-school way of marketing, but email marketing works really well even now.

Social media, in general – Facebook, LinkedIn, Twitter, Instagram, and TikTok – are the ones I’m on.

Now, it has been reported that TikTok gets more traffic than Google itself. Not sure how accurate that statement is, but what I do know is that there are a ton of people on that platform, and it would be stupid to ignore it due to the sheer volume of people on there.

As an SEO, I’m always looking at ways to drive good traffic to my website, whether that be paid, social, emails, or retargeting via pixels. I think you need to try and grab what you can out there. You don’t want to focus on one way of getting traffic.

YouTube, over the last few years, has been an amazing platform for me personally. When COVID hit, I took the opportunity to do a lot more video content, and that has worked very well in my favor.”

Taking The Leap To Build A Personal Brand

What should a digital marketer know about being in an SEO agency from Day 1?

CC: “I think they should learn as much as they can from agency life, see it as their apprenticeship, and learn as much as they can on the processes, reporting, how to retain clients, and all of the amazing stuff that agencies do very well.

But they should also know that there is a lot of fluff on the agency side and a lot of client deliverables that don’t always mean they are good from an SEO perspective.

I’m not saying every agency does all the fluff or offers substandard work. But I do see a lot of people who come out of agencies and believe in all the fluff that they are trained to speak to clients about.

So, I think, in general, they should know that playing the actual SEO game against what we tell clients is very often a different game. So, they should know the difference, which will stand them in good stead when they leave agency life.

I have a very good friend whom I’ve watched grow in recent years. Ryan Darani worked for a big digital agency, and for sure, he learned some amazing things there, which still work very well in his favor (mainly from an audit, reporting, and technical perspective).

However, there were areas of weakness and some bad agency mindset that had to be ironed out now that he is a freelancer. He has adapted very well and is doing amazing for someone who went out on his own just two years ago.

But overall, grab all you can from agency life, particularly those SOP processes, reporting – all the technical stuff you can, as this is often something many people who haven’t experienced agency life fall short on.”

What’s been your greatest digital marketing achievement to date?

CC: “The best achievement, other than some of the website flips and money gains I’ve made on certain projects – which, of course, no one really cares about – would be making the transition from unknown agency owner into becoming a personal brand.

A lot of people think that it is an easy task. The reality is, speaking at conferences, being on video, and offering value upfront is a lot of hard work. Not just traveling to conferences but speaking in front of an audience took me outside my comfort zone.

Being sat on YouTube, doing podcasts, and all of the other stuff was something I had never done before; and even in my school days, I hated speaking in front of an audience.

Watching many others build up personal brands while I was building my agency was great to watch, and I always had a [voice] inside me saying, ‘You can do this! Why don’t you go and do it? Why let anyone else get up there and get the exposure?’

You have to believe in yourself and make sure that you get yourself up there. While many folks will not want to do that because they are shy, an introvert, or whatever, when speaking to other speakers, they all have similar fears or get nervous before speaking.

And I, for sure, had serious nerves at the start of my speaking career, and it was amazing to push through and overcome those fears, and that was a massive achievement for me.”

Key To SEO And Career Growth

Can you share any SEO growth hack that always works for you?

CC: “For many years, I’ve always seen traffic work very well when sent to a video, blog post, page, or whatever. Even if we take LinkedIn, for example.

If I do a post on LinkedIn, and someone in my network likes it, comments on it, or shares it, that post is then seen by their friends, which turns into more engagement, and then their friends see it and hopefully comment and like the post.

When Google sees something that is widely engaged, it ranks it well. The same goes for any social media platform when you want a post to go viral.

So, tip 1: Offer value upfront. Don’t put out bland, boring content; people will simply not engage. Try and offer some value upfront.

Tip 2: So, when I do a blog post, I will then send it to my push notification subscribers. It then goes out on social media. If it’s a really good post, it will also go out to my mailing list. I then also might do some paid social ads.

This kickstarts the post, article, or whatever you are trying to put out there, but you must utilize your own audience first and use a sequence of events to get traffic onto your articles, which in turn, if done well, should give you the lift you need to make the post viral to some degree.”

What advice do you have for those just getting started in their SEO careers or launching their startup?

CC: “I see so many people early in their careers or when they launch a start-up analyze every single small detail before taking action. I’d highly recommend simply taking action. Why over-analyze things? Keep it simple and use common sense.

A bit of effort never goes far wrong in this industry, and it is always good to learn from mistakes you make anyway. Just start taking action.

I’ve made more mistakes than most, but as long as I learn from them, then it’s always a good thing.

You will never ever hit your goals straight off the bat; whether it’s your SEO career or a project you’re working on, things can be tweaked as you go. No one in this game knows 100% of what they are doing, so don’t be fooled by anyone suggesting that they do.

Read between the lines and never be scared to test and add your own mix to things.”

Check out this SEJ Show episode with Loren Baker, where Campbell shared his insights on domain leasing, link-building best practices, and a lot more.

More Resources:


Featured Image: Courtesy of Craig Campbell/SEO Glasgow



Source link

SEO

Essential Functions For SEO Data Analysis

Published

on

Essential Functions For SEO Data Analysis

Learning to code, whether with PythonJavaScript, or another programming language, has a whole host of benefits, including the ability to work with larger datasets and automate repetitive tasks.

But despite the benefits, many SEO professionals are yet to make the transition – and I completely understand why! It isn’t an essential skill for SEO, and we’re all busy people.

If you’re pressed for time, and you already know how to accomplish a task within Excel or Google Sheets, then changing tack can feel like reinventing the wheel.

When I first started coding, I initially only used Python for tasks that I couldn’t accomplish in Excel – and it’s taken several years to get to the point where it’s my defacto choice for data processing.

Looking back, I’m incredibly glad that I persisted, but at times it was a frustrating experience, with many an hour spent scanning threads on Stack Overflow.

This post is designed to spare other SEO pros the same fate.

Within it, we’ll cover the Python equivalents of the most commonly used Excel formulas and features for SEO data analysis – all of which are available within a Google Colab notebook linked in the summary.

Specifically, you’ll learn the equivalents of:

  • LEN.
  • Drop Duplicates.
  • Text to Columns.
  • SEARCH/FIND.
  • CONCATENATE.
  • Find and Replace.
  • LEFT/MID/RIGHT.
  • IF.
  • IFS.
  • VLOOKUP.
  • COUNTIF/SUMIF/AVERAGEIF.
  • Pivot Tables.

Amazingly, to accomplish all of this, we’ll primarily be using a singular library – Pandas – with a little help in places from its big brother, NumPy.

Prerequisites

For the sake of brevity, there are a few things we won’t be covering today, including:

  • Installing Python.
  • Basic Pandas, like importing CSVs, filtering, and previewing dataframes.

If you’re unsure about any of this, then Hamlet’s guide on Python data analysis for SEO is the perfect primer.

Now, without further ado, let’s jump in.

LEN

LEN provides a count of the number of characters within a string of text.

For SEO specifically, a common use case is to measure the length of title tags or meta descriptions to determine whether they’ll be truncated in search results.

Within Excel, if we wanted to count the second cell of column A, we’d enter:

=LEN(A2)
Screenshot from Microsoft Excel, November 2022

Python isn’t too dissimilar, as we can rely on the inbuilt len function, which can be combined with Pandas’ loc[] to access a specific row of data within a column:

len(df['Title'].loc[0])

In this example, we’re getting the length of the first row in the “Title” column of our dataframe.

len function python
Screenshot of VS Code, November, 2022

Finding the length of a cell isn’t that useful for SEO, though. Normally, we’d want to apply a function to an entire column!

In Excel, this would be achieved by selecting the formula cell on the bottom right-hand corner and either dragging it down or double-clicking.

When working with a Pandas dataframe, we can use str.len to calculate the length of rows within a series, then store the results in a new column:

df['Length'] = df['Title'].str.len()

Str.len is a ‘vectorized’ operation, which is designed to be applied simultaneously to a series of values. We’ll use these operations extensively throughout this article, as they almost universally end up being faster than a loop.

Another common application of LEN is to combine it with SUBSTITUTE to count the number of words in a cell:

=LEN(TRIM(A2))-LEN(SUBSTITUTE(A2," ",""))+1

In Pandas, we can achieve this by combining the str.split and str.len functions together:

df['No. Words'] = df['Title'].str.split().str.len()

We’ll cover str.split in more detail later, but essentially, what we’re doing is splitting our data based upon whitespaces within the string, then counting the number of component parts.

word count PythonScreenshot from VS Code, November 2022

Dropping Duplicates

Excel’s ‘Remove Duplicates’ feature provides an easy way to remove duplicate values within a dataset, either by deleting entirely duplicate rows (when all columns are selected) or removing rows with the same values in specific columns.

Excel drop duplicatesScreenshot from Microsoft Excel, November 2022

In Pandas, this functionality is provided by drop_duplicates.

To drop duplicate rows within a dataframe type:

df.drop_duplicates(inplace=True)

To drop rows based on duplicates within a singular column, include the subset parameter:

df.drop_duplicates(subset="column", inplace=True)

Or specify multiple columns within a list:

df.drop_duplicates(subset=['column','column2'], inplace=True)

One addition above that’s worth calling out is the presence of the inplace parameter. Including inplace=True allows us to overwrite our existing dataframe without needing to create a new one.

There are, of course, times when we want to preserve our raw data. In this case, we can assign our deduped dataframe to a different variable:

df2 = df.drop_duplicates(subset="column")

Text To Columns

Another everyday essential, the ‘text to columns’ feature can be used to split a text string based on a delimiter, such as a slash, comma, or whitespace.

As an example, splitting a URL into its domain and individual subfolders.

Excel drop duplicatesScreenshot from Microsoft Excel, November 2022

When dealing with a dataframe, we can use the str.split function, which creates a list for each entry within a series. This can be converted into multiple columns by setting the expand parameter to True:

df['URL'].str.split(pat="/", expand=True)
str split PythonScreenshot from VS Code, November 2022

As is often the case, our URLs in the image above have been broken up into inconsistent columns, because they don’t feature the same number of folders.

This can make things tricky when we want to save our data within an existing dataframe.

Specifying the n parameter limits the number of splits, allowing us to create a specific number of columns:

df[['Domain', 'Folder1', 'Folder2', 'Folder3']] = df['URL'].str.split(pat="/", expand=True, n=3)

Another option is to use pop to remove your column from the dataframe, perform the split, and then re-add it with the join function:

df = df.join(df.pop('Split').str.split(pat="/", expand=True))

Duplicating the URL to a new column before the split allows us to preserve the full URL. We can then rename the new columns:🐆

df['Split'] = df['URL']

df = df.join(df.pop('Split').str.split(pat="/", expand=True))

df.rename(columns = {0:'Domain', 1:'Folder1', 2:'Folder2', 3:'Folder3', 4:'Parameter'}, inplace=True)
Split pop join functions PythonScreenshot from VS Code, November 2022

CONCATENATE

The CONCAT function allows users to combine multiple strings of text, such as when generating a list of keywords by adding different modifiers.

In this case, we’re adding “mens” and whitespace to column A’s list of product types:

=CONCAT($F$1," ",A2)
concat Excel
Screenshot from Microsoft Excel, November 2022

Assuming we’re dealing with strings, the same can be achieved in Python using the arithmetic operator:

df['Combined] = 'mens' + ' ' + df['Keyword']

Or specify multiple columns of data:

df['Combined'] = df['Subdomain'] + df['URL']
concat PythonScreenshot from VS Code, November 2022

Pandas has a dedicated concat function, but this is more useful when trying to combine multiple dataframes with the same columns.

For instance, if we had multiple exports from our favorite link analysis tool:

df = pd.read_csv('data.csv')
df2 = pd.read_csv('data2.csv')
df3 = pd.read_csv('data3.csv')

dflist = [df, df2, df3]

df = pd.concat(dflist, ignore_index=True)

SEARCH/FIND

The SEARCH and FIND formulas provide a way of locating a substring within a text string.

These commands are commonly combined with ISNUMBER to create a Boolean column that helps filter down a dataset, which can be extremely helpful when performing tasks like log file analysis, as explained in this guide. E.g.:

=ISNUMBER(SEARCH("searchthis",A2)
isnumber search ExcelScreenshot from Microsoft Excel, November 2022

The difference between SEARCH and FIND is that find is case-sensitive.

The equivalent Pandas function, str.contains, is case-sensitive by default:

df['Journal'] = df['URL'].str.contains('engine', na=False)

Case insensitivity can be enabled by setting the case parameter to False:

df['Journal'] = df['URL'].str.contains('engine', case=False, na=False)

In either scenario, including na=False will prevent null values from being returned within the Boolean column.

One massive advantage of using Pandas here is that, unlike Excel, regex is natively supported by this function – as it is in Google sheets via REGEXMATCH.

Chain together multiple substrings by using the pipe character, also known as the OR operator:

df['Journal'] = df['URL'].str.contains('engine|search', na=False)

Find And Replace

Excel’s “Find and Replace” feature provides an easy way to individually or bulk replace one substring with another.

find replace ExcelScreenshot from Microsoft Excel, November 2022

When processing data for SEO, we’re most likely to select an entire column and “Replace All.”

The SUBSTITUTE formula provides another option here and is useful if you don’t want to overwrite the existing column.

As an example, we can change the protocol of a URL from HTTP to HTTPS, or remove it by replacing it with nothing.

When working with dataframes in Python, we can use str.replace:

df['URL'] = df['URL'].str.replace('http://', 'https://')

Or:

df['URL'] = df['URL'].str.replace('http://', '') # replace with nothing

Again, unlike Excel, regex can be used – like with Google Sheets’ REGEXREPLACE:

df['URL'] = df['URL'].str.replace('http://|https://', '')

Alternatively, if you want to replace multiple substrings with different values, you can use Python’s replace method and provide a list.

This prevents you from having to chain multiple str.replace functions:

df['URL'] = df['URL'].replace(['http://', ' https://'], ['https://www.', 'https://www.’], regex=True)

LEFT/MID/RIGHT

Extracting a substring within Excel requires the usage of the LEFT, MID, or RIGHT functions, depending on where the substring is located within a cell.

Let’s say we want to extract the root domain and subdomain from a URL:

=MID(A2,FIND(":",A2,4)+3,FIND("/",A2,9)-FIND(":",A2,4)-3)
left mid right ExcelScreenshot from Microsoft Excel, November 2022

Using a combination of MID and multiple FIND functions, this formula is ugly, to say the least – and things get a lot worse for more complex extractions.

Again, Google Sheets does this better than Excel, because it has REGEXEXTRACT.

What a shame that when you feed it larger datasets, it melts faster than a Babybel on a hot radiator.

Thankfully, Pandas offers str.extract, which works in a similar way:

df['Domain'] = df['URL'].str.extract('.*://?([^/]+)')
str extract PythonScreenshot from VS Code, November 2022

Combine with fillna to prevent null values, as you would in Excel with IFERROR:

df['Domain'] = df['URL'].str.extract('.*://?([^/]+)').fillna('-')

If

IF statements allow you to return different values, depending on whether or not a condition is met.

To illustrate, suppose that we want to create a label for keywords that are ranking within the top three positions.

Excel IFScreenshot from Microsoft Excel, November 2022

Rather than using Pandas in this instance, we can lean on NumPy and the where function (remember to import NumPy, if you haven’t already):

df['Top 3'] = np.where(df['Position'] <= 3, 'Top 3', 'Not Top 3')

Multiple conditions can be used for the same evaluation by using the AND/OR operators, and enclosing the individual criteria within round brackets:

df['Top 3'] = np.where((df['Position'] <= 3) & (df['Position'] != 0), 'Top 3', 'Not Top 3')

In the above, we’re returning “Top 3” for any keywords with a ranking less than or equal to three, excluding any keywords ranking in position zero.

IFS

Sometimes, rather than specifying multiple conditions for the same evaluation, you may want multiple conditions that return different values.

In this case, the best solution is using IFS:

=IFS(B2<=3,"Top 3",B2<=10,"Top 10",B2<=20,"Top 20")
IFS ExcelScreenshot from Microsoft Excel, November 2022

Again, NumPy provides us with the best solution when working with dataframes, via its select function.

With select, we can create a list of conditions, choices, and an optional value for when all of the conditions are false:

conditions = [df['Position'] <= 3, df['Position'] <= 10, df['Position'] <=20]

choices = ['Top 3', 'Top 10', 'Top 20']

df['Rank'] = np.select(conditions, choices, 'Not Top 20')

It’s also possible to have multiple conditions for each of the evaluations.

Let’s say we’re working with an ecommerce retailer with product listing pages (PLPs) and product display pages (PDPs), and we want to label the type of branded pages ranking within the top 10 results.

The easiest solution here is to look for specific URL patterns, such as a subfolder or extension, but what if competitors have similar patterns?

In this scenario, we could do something like this:

conditions = [(df['URL'].str.contains('/category/')) & (df['Brand Rank'] > 0),
(df['URL'].str.contains('/product/')) & (df['Brand Rank'] > 0),
(~df['URL'].str.contains('/product/')) & (~df['URL'].str.contains('/category/')) & (df['Brand Rank'] > 0)]

choices = ['PLP', 'PDP', 'Other']

df['Brand Page Type'] = np.select(conditions, choices, None)

Above, we’re using str.contains to evaluate whether or not a URL in the top 10 matches our brand’s pattern, then using the “Brand Rank” column to exclude any competitors.

In this example, the tilde sign (~) indicates a negative match. In other words, we’re saying we want every brand URL that doesn’t match the pattern for a “PDP” or “PLP” to match the criteria for ‘Other.’

Lastly, None is included because we want non-brand results to return a null value.

np select PythonScreenshot from VS Code, November 2022

VLOOKUP

VLOOKUP is an essential tool for joining together two distinct datasets on a common column.

In this case, adding the URLs within column N to the keyword, position, and search volume data in columns A-C, using the shared “Keyword” column:

=VLOOKUP(A2,M:N,2,FALSE)
vlookup ExcelScreenshot from Microsoft Excel, November 2022

To do something similar with Pandas, we can use merge.

Replicating the functionality of an SQL join, merge is an incredibly powerful function that supports a variety of different join types.

For our purposes, we want to use a left join, which will maintain our first dataframe and only merge in matching values from our second dataframe:

mergeddf = df.merge(df2, how='left', on='Keyword')

One added advantage of performing a merge over a VLOOKUP, is that you don’t have to have the shared data in the first column of the second dataset, as with the newer XLOOKUP.

It will also pull in multiple rows of data rather than the first match in finds.

One common issue when using the function is for unwanted columns to be duplicated. This occurs when multiple shared columns exist, but you attempt to match using one.

To prevent this – and improve the accuracy of your matches – you can specify a list of columns:

mergeddf = df.merge(df2, how='left', on=['Keyword', 'Search Volume'])

In certain scenarios, you may actively want these columns to be included. For instance, when attempting to merge multiple monthly ranking reports:

mergeddf = df.merge(df2, on='Keyword', how='left', suffixes=('', '_october'))
    .merge(df3, on='Keyword', how='left', suffixes=('', '_september'))

The above code snippet executes two merges to join together three dataframes with the same columns – which are our rankings for November, October, and September.

By labeling the months within the suffix parameters, we end up with a much cleaner dataframe that clearly displays the month, as opposed to the defaults of _x and _y seen in the earlier example.

multi merge PythonScreenshot from VS Code, November 2022

COUNTIF/SUMIF/AVERAGEIF

In Excel, if you want to perform a statistical function based on a condition, you’re likely to use either COUNTIF, SUMIF, or AVERAGEIF.

Commonly, COUNTIF is used to determine how many times a specific string appears within a dataset, such as a URL.

We can accomplish this by declaring the ‘URL’ column as our range, then the URL within an individual cell as our criteria:

=COUNTIF(D:D,D2)
Excel countifScreenshot from Microsoft Excel, November 2022

In Pandas, we can achieve the same outcome by using the groupby function:

df.groupby('URL')['URL'].count()
Python groupbyScreenshot from VS Code, November 2022

Here, the column declared within the round brackets indicates the individual groups, and the column listed in the square brackets is where the aggregation (i.e., the count) is performed.

The output we’re receiving isn’t perfect for this use case, though, because it’s consolidated the data.

Typically, when using Excel, we’d have the URL count inline within our dataset. Then we can use it to filter to the most frequently listed URLs.

To do this, use transform and store the output in a column:

df['URL Count'] = df.groupby('URL')['URL'].transform('count')
Python groupby transformScreenshot from VS Code, November 2022

You can also apply custom functions to groups of data by using a lambda (anonymous) function:

df['Google Count'] = df.groupby(['URL'])['URL'].transform(lambda x: x[x.str.contains('google')].count())

In our examples so far, we’ve been using the same column for our grouping and aggregations, but we don’t have to. Similarly to COUNTIFS/SUMIFS/AVERAGEIFS in Excel, it’s possible to group using one column, then apply our statistical function to another.

Going back to the earlier search engine results page (SERP) example, we may want to count all ranking PDPs on a per-keyword basis and return this number alongside our existing data:

df['PDP Count'] = df.groupby(['Keyword'])['URL'].transform(lambda x: x[x.str.contains('/product/|/prd/|/pd/')].count())
Python groupby countifsScreenshot from VS Code, November 2022

Which in Excel parlance, would look something like this:

=SUM(COUNTIFS(A:A,[@Keyword],D:D,{"*/product/*","*/prd/*","*/pd/*"}))

Pivot Tables

Last, but by no means least, it’s time to talk pivot tables.

In Excel, a pivot table is likely to be our first port of call if we want to summarise a large dataset.

For instance, when working with ranking data, we may want to identify which URLs appear most frequently, and their average ranking position.

pivot table ExcelScreenshot from Microsoft Excel, November 2022

Again, Pandas has its own pivot tables equivalent – but if all you want is a count of unique values within a column, this can be accomplished using the value_counts function:

count = df['URL'].value_counts()

Using groupby is also an option.

Earlier in the article, performing a groupby that aggregated our data wasn’t what we wanted – but it’s precisely what’s required here:

grouped = df.groupby('URL').agg(
     url_frequency=('Keyword', 'count'),
     avg_position=('Position', 'mean'),
     )

grouped.reset_index(inplace=True)
groupby-pivot PythonScreenshot from VS Code, November 2022

Two aggregate functions have been applied in the example above, but this could easily be expanded upon, and 13 different types are available.

There are, of course, times when we do want to use pivot_table, such as when performing multi-dimensional operations.

To illustrate what this means, let’s reuse the ranking groupings we made using conditional statements and attempt to display the number of times a URL ranks within each group.

ranking_groupings = df.groupby(['URL', 'Grouping']).agg(
     url_frequency=('Keyword', 'count'),
     )
python groupby groupingScreenshot from VS Code, November 2022

This isn’t the best format to use, as multiple rows have been created for each URL.

Instead, we can use pivot_table, which will display the data in different columns:

pivot = pd.pivot_table(df,
index=['URL'],
columns=['Grouping'],
aggfunc="size",
fill_value=0,
)
pivot table PythonScreenshot from VS Code, November 2022

Final Thoughts

Whether you’re looking for inspiration to start learning Python, or are already leveraging it in your SEO workflows, I hope that the above examples help you along on your journey.

As promised, you can find a Google Colab notebook with all of the code snippets here.

In truth, we’ve barely scratched the surface of what’s possible, but understanding the basics of Python data analysis will give you a solid base upon which to build.

More resources:


Featured Image: mapo_japan/Shutterstock



Source link

Continue Reading

DON'T MISS ANY IMPORTANT NEWS!
Subscribe To our Newsletter
We promise not to spam you. Unsubscribe at any time.
Invalid email address

Trending

en_USEnglish