SEO for Lawyers & Law Firms: The Complete Guide
What do you do these days when you have a question? You ask Google. And what do you do when you look for a local service? You ask Google too. That’s why lawyers, attorneys, and law firms have been using SEO to get more clients. And with this four-step guide, you can too.
But first, let’s answer an important question…
SEO (search engine optimization) is the practice of growing a website’s traffic from organic search results.
The end result of SEO is more visibility for your website on search engine results pages (SERPs) so that more people can get in touch with your business. That’s, in a nutshell, how searchers can turn into your visitors and how visitors can turn into your customers.
Moreover, the great thing about organic traffic is that it’s continuous as long as you rank and you don’t need to pay for each click you get (unlike digital advertising).
Speaking of advertising, law-related keywords can be quite expensive to bid on. SEO allows you to take advantage of their popularity without an ad budget.
So basically, the reason why law firms, lawyers, and attorneys need SEO is the same as why they need a website: because people look for law services online. When your business doesn’t appear in Google, you simply leave money on the table.
Another way lawyers benefit from SEO is by earning potential clients’ trust with helpful content. When people look for solutions to their problems, they may find your content through Google and see that you know your stuff.
So without further ado, let’s see how lawyers can get the most out of SEO.
The Google Map Pack (also called Google Local Pack and Google Snack Pack) is a so-called rich result that Google shows to searchers to help them find the best result based on location, among other things.
In most cases, the queries your potential clients use to find businesses like yours will trigger Google’s map pack because Google “thinks” people want to find something related to a location.
Google’s map pack is displayed on top of the organic results. Apart from the ads, it’s the first thing that searchers see. So getting your name in there dramatically increases your chances of being discovered.
No one and nothing can guarantee your place in the map pack. This is because your competition will do similar things to get there. Plus, nobody except for Google itself knows how exactly local ranking works. What we do know are the three principles Google uses fluidly to determine what goes into the local pack:
- Relevance – How well a business profile matches the meaning of the query.
- Distance – The distance between the search result and the location of the searcher or location specified in the query (e.g., “lawyer mountain view”).
- Prominence – This takes a number of things into account: popularity in the “offline” world, online reviews and rankings, links to the website and, interestingly enough, rankings in the organic search results.
Based on Google’s guidelines and known local ranking factors, here are three things you should do to increase your chances of showing up in Google’s map pack.
Get and optimize your Google Business Profile
Google’s map pack is made up of Google Business Profiles, so it’s crucial that you list all of your business locations with the service (but don’t use virtual offices).
What’s more, with this profile, your business will be eligible to show up on Google Maps.
And Google will be able to display a local knowledge panel for queries that include your business name.
If you’re starting fresh, you will need to create your business profile. If the business already exists or someone else has claimed it, you may need to claim your profile instead.
The process of filling out the details in your business profile is similar in both cases. And it’s quite straightforward—a bit like setting up a social media account. But to make sure your profile is optimized, check out tips from our guide: How to Optimize Your Google My Business Listing in 30 Minutes.
Remember, the more specific information and relevant photos you share, the better. And when in doubt, check with Google’s guidelines. This is because a violation of those can lead to profile suspension.
Sidenote.
Some SEO guides state that information displayed in these rich results comes from schema markup. That’s not accurate. First and foremost, they come from business profiles. So while it doesn’t hurt to apply schema markup to your website, you should focus on optimizing your Google Business Profile.
Get listed on local citation sites and directories
Local citations and directories are online mentions of your business that display your business name, address, phone number and, in most cases, your website too.
You need them for three reasons:
- They are a ranking factor for Google Map Pack; they can help you rank higher in those results.
- While any local directories can help you rank higher in Google Map Pack, the ones that feature a link to your website can help you with organic search results too.
- They will help searchers find your business in a) search engines like Google and b) search results of those directories.
Start by getting listed with big aggregators like Foursquare. Then submit your data to popular platforms like Facebook, Yelp, and Bing Places, and go for popular directories in your local area and industry like FindLaw, Justia, or LegalMatch. Just make sure to keep your citations consistent at all times.
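To make "consistent" a bit more concrete, here is a minimal sketch of a citation check. It assumes you have collected the name, address, and phone number from each listing by hand. The normalization rules and the sample listings are illustrative assumptions, not a standard that Google or any directory publishes.

```python
import re


def normalize_citation(name: str, address: str, phone: str) -> tuple[str, str, str]:
    """Reduce a citation to a comparable form (illustrative rules only)."""
    def squash(text: str) -> str:
        # Lowercase, drop punctuation, collapse runs of whitespace.
        text = re.sub(r"[^\w\s]", "", text.lower())
        return re.sub(r"\s+", " ", text).strip()

    digits = re.sub(r"\D", "", phone)  # keep only the digits of the phone number
    return squash(name), squash(address), digits


def inconsistent_listings(listings: dict[str, tuple[str, str, str]]) -> list[str]:
    """Return the sources whose normalized citation differs from the first listing."""
    items = list(listings.items())
    if not items:
        return []
    reference = normalize_citation(*items[0][1])
    return [src for src, nap in items[1:] if normalize_citation(*nap) != reference]


# Hypothetical listings for a made-up firm.
listings = {
    "website": ("Example Law LLP", "12 Main St., Springfield", "(555) 010-0000"),
    "directory_a": ("Example Law LLP", "12 Main St, Springfield", "555-010-0000"),
    "directory_b": ("Example Law", "12 Main Street, Springfield", "555 010 0000"),
}
print(inconsistent_listings(listings))  # ['directory_b']
```

Differences in punctuation or phone formatting are ignored here, while a shortened firm name or a differently spelled street is flagged for a manual look.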
To save time when looking for local citations, use Ahrefs’ Link Intersect tool. Just open the tool, plug in your competitors’ URLs, and leave the last input blank.
Here are some sample results. Note that you can use the tool to find other link opportunities too. (In this case, the tool shows us almost 15K domains.)
Encourage your clients to leave reviews
According to Google, positive reviews and rankings help its algorithms understand which businesses are more prominent.
You can ask your customers to leave reviews any way you like. Since we’re focusing on ranking on Google, reviews submitted there will likely be the most important ones.
Things to remember: Don’t buy reviews, don’t offer something in exchange for reviews, and try to reply to reviews as often as possible. (Here are Google’s guidelines for managing reviews.)
That’s about it for optimizing for Google Map Pack. Let’s move on to a slightly more complex topic of optimizing for organic results, i.e., the results below the map pack.
To stand a chance of ranking in the organic search results, you need pages with content relevant to a given search query. The more useful, interesting, and well-linked that content is, the higher your chances are. That’s what we’re going to focus on going forward.
List your services
SEO or not, you need to provide visitors with a list of services that you offer and also share where you offer your services. Some of the services will have a considerable search demand; others potentially not. Later on, we will expand on that using keyword research.
So for example, say you’re specialized in entertainment law, including a number of areas like talent contracts, music law, and publishing. The absolute minimum here is to create a page that explains your expertise in entertainment law and mentions the above specialities.
However, a more effective tactic is to create a content hub where the pillar page talks about your expertise in entertainment law in general and, at the same time, links to subpages dedicated to each area of that type of law you cover.
This page is an example of a content hub (aka topic cluster). We have the general information on entertainment law (there’s more of it on the page below that part) and links to relevant areas on the left. Each link leads to a page dedicated to an area.
In short, here are some benefits of the content hub approach:
- More topical authority – Interlinks from related content build semantic relationships, which may be a signal of authoritativeness of the topic for Google (learn more).
- More link authority – Pages linked in a hub benefit from each other’s backlinks.
- A user-friendly way to navigate your website – Information is just a click away.
- More perceived value – People often see such hubs as a valuable resource on the topic (which may also increase the propensity to link to your hub).
An additional idea worth considering is creating separate hubs for practice areas and industries. This way, you will increase the number of keywords you can rank for while providing a clear structure for the user.
List your locations
The goal here is to help Google index your website for keywords with local search intent. Some of these keywords are explicit: the searcher uses a location modifier, as in “new york entertainment lawyer.” Others are implicit: there is no location modifier, but Google still thinks there’s local search intent (“bakery” will show you bakeries in your area).
So here is a tactic that will save you time spent on creating a ton of pages for each location and save you from duplicate content issues:
- You can create a page (for example, one called “Contact”) with at least each location’s exact address (including the state/region), phone number, and email (if the email addresses vary).
- Include your locations in the footer. So if you have multiple locations, you can just mention the name of the region and city and link them to the page with the locations’ details.
- If you want to provide more specific information related to the locations, such as practicing lawyers, you can create subpages for each location.
- Reminder: make sure to list all of your physical locations in Google Business Profile Manager.
Sidenote.
To Google, N.Y.C. is the same as New York. D.C. is, in this context, the same as Washington D.C. So you don’t need to list all of the popular abbreviations of cities or regions.
Do keyword research
Up to this point, we’ve got ideas on what to create content about from the lawyer’s perspective. Now let’s look at the searcher’s perspective.
From the perspective of a searcher, a keyword is a word or phrase that they type in Google to find things like local products or services.
This means that for us, keywords will become the topics of our content, blog posts, landing pages, etc., and/or things worth mentioning in our content. More importantly, they will be the drivers of organic search traffic.
Here are some keyword research ideas for lawyers.
Expand your services by analyzing other ranking pages
For this, you will need a keyword research tool like Ahrefs’ Keywords Explorer.
Go to the tool, type in a seed keyword like “corporate law,” and go to Related terms. The tool will show you keywords that other pages rank for and talk about while ranking for your seed keyword.
So for example, it may be worth targeting these keywords:
Look up specific competitors’ keywords
Some of your competitors will already be ahead of the SEO game, targeting lucrative keywords with their content. But that shouldn’t stop you from ranking for the same keywords (and even outranking the competition).
There are two methods for analyzing your competition in this scope.
The first one is done by plugging in your competitor’s domain in Ahrefs’ Site Explorer set to “subdomains.” This will show all of the keywords your competitor ranks for.
For a more manageable keyword list, you can also plug in a specific page’s URL (like the blog or practice areas) and/or use filters to display keywords by criteria like search volume, traffic potential, or keyword difficulty.
In the second method, you can look up a few competitors in one go. Go to the Content Gap report in Site Explorer, plug in your competitors, and leave the last input field blank.
This will show you keywords where at least one of your competitors ranks in the top 10.
If you already have a live website, you can also insert your domain to see the keywords that your competitors rank for but you do not. For this, use the last input field for your domain.
Look even further
If you want to uncover more opportunities for driving organic search traffic, spend some more time in Keywords Explorer and browse through:
- Google autosuggestions.
- Common questions.
- Topics your competitors blog about.
For example, we can take our Also rank for report and make it show only keywords with questions by including words like “why,” “how,” “when,” etc., in the Include filter.
This way, we can uncover common questions related to areas of law like the one below. Note that the first five search results belong to law firms; it’s not uncommon to see law firms attracting visitors through education.
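If you prefer to work on an exported keyword list outside the tool, the same question filter can be sketched in a few lines. The keyword list is made up for illustration, and the question words beyond “why,” “how,” and “when” are an assumption about what is worth including.

```python
# "why", "how", "when" come from the text above; the rest are assumed additions.
QUESTION_WORDS = ("why", "how", "when", "what", "who", "where")


def question_keywords(keywords, question_words=QUESTION_WORDS):
    """Keep only keywords that contain a question word as a whole word."""
    wanted = set(question_words)
    return [kw for kw in keywords if wanted & set(kw.lower().split())]


# Hypothetical export from a keyword research tool.
exported = [
    "corporate law firm",
    "how to become a corporate lawyer",
    "what is corporate law",
    "corporate law salary",
    "When do you need a corporate lawyer",
]
print(question_keywords(exported))
```

Matching on whole words avoids false hits such as “show” matching “how.”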
An important skill in keyword research is choosing and prioritizing keywords. To see how it’s done step by step, read this: Keyword Research: The Beginner’s Guide by Ahrefs.
Create optimized pages
Now that we know what to create content about, it’s time to learn how to create that content. So in this section, we’ll focus on optimizing the so-called on-page SEO factors: Things that you can include on your page or inside its HTML to improve its ranking and visibility on the SERPs.
Align with search intent
Search intent refers to the reason behind the search. It’s one of the strongest ranking factors.
The search intent of any given search query can be identified by looking at the SERPs and determining three things:
- Content type – Is the dominant type a blog post, landing page, video, or free tool?
- Content format – Common formats include how-to guides, list posts, opinion pieces, definition posts, etc.
- Content angle – The unique selling point of the results, e.g., “in 2022” or “for beginners.”
For example, judging from the top-ranking pages and the “People Also Ask” box for “emancipation in new york,” it seems that Google thinks people want to know what that is.
So the best way to align with search intent is through an article that explains what emancipation is and maybe even explains the processes behind it.
To become proficient in optimizing for search intent, see our guide: What Is Search Intent? A Complete Guide for Beginners.
Create quality and up-to-date content
Google is getting better and better at understanding quality content. To give you a quick overview of its SEO guidelines, you should make your content:
- Easy to read – When writing about the law, you probably won’t be able to avoid jargon. But you can still explain it sufficiently and use simple sentences everyone (actually, even a 9-year-old) can understand.
- Clearly organized – Break text into sections with descriptive headings.
- Up to date – Crucial in law-related topics.
- Unique – You can take cues from the best-performing content but try to provide some unique value to your readers at the same time. For example, you can provide a unique content angle or include educational materials like an infographic. This is also the part where you want to consider adding link bait.
- Focused on providing essential information to solve a searcher’s problem – Longer content doesn’t mean that it’s of higher quality.
- Aligned with E-A-T guidelines – More on that in the next section.
Demonstrate E-A-T
E-A-T stands for expertise, authoritativeness, and trustworthiness. It’s a concept taken from a guideline that Google Quality Raters (humans) use to help engineers improve Google’s algorithm. It means that Google wants to promote pages that demonstrate E-A-T, and it’s getting better at it.
E-A-T bears the most importance for YMYL topics (Your Money or Your Life). Surely, law is one of them.
Besides the quite obvious things like keeping your content accurate and up to date and citing your sources where necessary, flashing your credentials can be helpful too.
So create an About page introducing you and other lawyers in your firm and demonstrate why people should trust you. Mention things like education, bar admissions, affiliations, awards, etc.
Then make sure each article that you publish mentions the author and links to their About page.
Two other tactics that may help you with demonstrating your expertise, authoritativeness, and trustworthiness are:
- Using schema markup on pages where you introduce the lawyers – Schema markup is a simple code that helps Google better understand your content. You can learn how to apply it with this guide.
- Getting links from authoritative sources – I’ll explain some link building tactics later on in the article.
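To give an idea of what schema markup for a lawyer's profile page can look like, here is a rough sketch that builds a JSON-LD snippet. The types and properties used (Person, LegalService, CollegeOrUniversity, jobTitle, worksFor, alumniOf, award) exist in the schema.org vocabulary. The function name, the choice of properties, and all of the sample values are assumptions made for illustration, so check them against your own pages before using anything like this.

```python
import json


def attorney_profile_jsonld(name, job_title, firm_name, profile_url, alumni_of, awards):
    """Build a schema.org Person description as a JSON-LD string (illustrative)."""
    data = {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "jobTitle": job_title,
        "url": profile_url,
        "worksFor": {"@type": "LegalService", "name": firm_name},
        "alumniOf": [
            {"@type": "CollegeOrUniversity", "name": school} for school in alumni_of
        ],
        "award": awards,
    }
    return json.dumps(data, indent=2)


# Hypothetical lawyer and firm.
snippet = attorney_profile_jsonld(
    name="Jane Doe",
    job_title="Partner, Entertainment Law",
    firm_name="Example Law LLP",
    profile_url="https://www.example.com/attorneys/jane-doe/",
    alumni_of=["Example University School of Law"],
    awards=["Example Bar Association Award"],
)
# The JSON goes inside a script tag of type application/ld+json on the profile page.
print(f'<script type="application/ld+json">\n{snippet}\n</script>')
```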
Recommended reading: What Is EAT? Why It’s Important for SEO
Optimize page titles and meta descriptions
Page titles and meta descriptions are important because the searchers can see them on the SERPs, and this can impact what they click on. Additionally, page titles are considered a “small ranking factor.”
Here’s what to take into account when crafting a page title:
- Make the title eye-catching and accurate – Write a line that piques users’ interest and accurately describes what’s unique about your content/offer.
- Insert the target keyword in your title – Make it sound natural to the reader. For your homepage title, make sure to include your company’s name.
- Fit within 60 characters
And here’s what’s important for meta descriptions:
- Make it compelling but not clickbaity
- Fit within 920 px – You can use a tool like SERPSim to help you with that.
- Synchronize the description with the title – The description can be an extension of or support what you claim in the title.
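The title checks above are easy to automate. The following is a small sketch, assuming the 60-character limit from the list and a plain substring match for the target keyword. The function name and sample titles are made up. Meta description width is not checked because pixel width depends on how the text is rendered.

```python
def check_page_title(title: str, target_keyword: str, max_chars: int = 60) -> list[str]:
    """Return a list of problems found in a page title (empty list means none)."""
    problems = []
    if len(title) > max_chars:
        problems.append(f"title is {len(title)} characters; limit is {max_chars}")
    if target_keyword.lower() not in title.lower():
        problems.append(f"target keyword {target_keyword!r} is missing")
    return problems


# Hypothetical titles for a made-up firm.
print(check_page_title("Entertainment Lawyer in New York | Example Law LLP", "entertainment lawyer"))
print(check_page_title("Welcome to Our Website", "entertainment lawyer"))
```

The first call reports no problems, while the second one flags the missing keyword.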
Use short and descriptive URLs
URLs are another “small” ranking factor. And you should optimize the URL with the user in mind. This means:
- Keep it short – Don’t use an overly nested structure. URLs should be an indication of the user’s location on a website.
- Make it human-readable – Use a few words that describe the page. Don’t use cryptic signs.
- Get an SSL certificate – This will show users that the connection is secure and private; they will see “HTTPS” at the beginning of your domain as a sign of secure connection in the browser. It’s also a lightweight ranking signal.
Here’s an example of a user-friendly URL that checks the above boxes. It comes from a subpage on art law—part of a content hub on entertainment law.
https://www.romanolaw.com/entertainment/art-law/
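If your CMS doesn’t generate readable URLs for you, a slug like the one above can be produced from a page title. This is a minimal sketch. The function name and the six-word cap are assumptions rather than a rule from any search engine.

```python
import re
import unicodedata


def slugify(title: str, max_words: int = 6) -> str:
    """Turn a page title into a short, lowercase, hyphen-separated URL slug."""
    # Strip accents so that only plain ASCII letters and digits remain.
    ascii_text = unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    words = re.findall(r"[a-z0-9]+", ascii_text.lower())
    return "-".join(words[:max_words])


print(slugify("Art Law"))                 # art-law
print(slugify("Music Law & Publishing"))  # music-law-publishing
```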
Recommended reading: How to Create SEO-Friendly URLs (Step-by-Step)
Add internal links
Internal links are the links to other pages on your website. You need them for a few reasons. They can:
- Provide a crawl path to target pages
- Boost other pages you own – This means they pass link equity. So pages that tend to get a lot of links can help other pages (where building links is harder) rank higher (see the “middleman” method).
- Help Google understand what the page is about – This is possible with the internal links’ anchor texts.
- Help users navigate your website
The content creation phase is the best time to include internal links. The three places you should consider when adding internal links are:
- Your money pages, i.e., the pages that describe your services or help visitors contact you. But don’t force it; add them when it’s a natural next step for the user.
- Other relevant articles on the topic.
- Related articles.
To find internal linking opportunities, you can use search operators in Google. Use the site: operator together with a search term in quotation marks, like this:
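As an illustration, the sketch below builds such a query for a placeholder domain and phrase (both are assumptions) and turns it into a Google search URL you can open in a browser.

```python
from urllib.parse import quote_plus


def internal_link_query(domain: str, phrase: str) -> str:
    """Combine the site: operator with an exact-match phrase in quotation marks."""
    return f'site:{domain} "{phrase}"'


def google_search_url(query: str) -> str:
    """URL-encode the query so it can be pasted into the address bar."""
    return "https://www.google.com/search?q=" + quote_plus(query)


query = internal_link_query("example.com", "entertainment law")
print(query)  # site:example.com "entertainment law"
print(google_search_url(query))
```

Each page in the results mentions the phrase, so it’s a candidate for linking to your page about that topic.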
Another way is to use the Link opportunities report in Ahrefs’ Site Audit. It focuses on the 10 best keywords for each page on your website and looks for mentions of those terms on your other pages.
Recommended reading: Here’s Why You Should Prioritize Internal Linking in 2022
Optimize images
Optimizing images for SEO is about these three things:
- Compressing image file size – You can use a plugin like ShortPixel or a bulk image optimizer like Kraken. This will help your website load faster and load speed counts for SEO (as shown in this case study).
- Using descriptive image file names
- Using descriptive alt texts – Together with file names, they help Google understand the context of your page. In addition, alt texts help visually impaired users.
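To spot images that are missing alt text, you can scan a page’s HTML. Here is a minimal sketch using Python’s built-in HTML parser. The class and function names and the sample markup are made up. Note that purely decorative images may intentionally carry an empty alt attribute, so treat the output as a list to review rather than a list of errors.

```python
from html.parser import HTMLParser


class ImageAudit(HTMLParser):
    """Collect the src of every <img> tag that has no (or an empty) alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = (attributes.get("alt") or "").strip()
        if not alt:
            self.missing_alt.append(attributes.get("src") or "(no src)")


def images_missing_alt(html: str) -> list[str]:
    audit = ImageAudit()
    audit.feed(html)
    return audit.missing_alt


# Hypothetical page fragment.
page = (
    '<img src="IMG_0042.jpg">'
    '<img src="art-law-attorney.jpg" alt="Art law attorney reviewing a gallery contract">'
)
print(images_missing_alt(page))  # ['IMG_0042.jpg']
```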
Translate your content (for multilingual regions)
International law firms and lawyers working in multilingual regions who provide services in multiple languages should consider looking into translating their content. They should do so for at least the pages they want to rank for multilingual phrases, e.g., homepage, services, locations, and contact page.
Here’s why:
- Content in the same language as the search query is likely more relevant to that query.
- It helps with link building outreach in the same language.
- Translated content will be more accessible to the group of people speaking that language.
Multilingual SEO involves many details and technicalities, so let me point you to our guide on the topic: Multilingual SEO: Translation and Marketing Guide.
That concludes dealing with on-page factors. Now we can move to off-page factors, i.e., factors that occur outside the website.
Build links
Links from other websites are one of the most impactful ranking factors. The more good quality backlinks you have, the higher you can rank in the organic results.
You can get backlinks in two ways:
- Earn them organically through link-worthy content on your site
- Build them through link building methods (what I’ll be explaining in this section of the article)
Sidenote.
According to some SEOs, all backlinks can help you rank both in Google’s map pack and organic results. This actually makes sense if you read into the hints that Google gives us on how it determines local ranking:
Generally speaking, to improve your local rankings, prioritize those link opportunities that are at the same time contextually relevant, are locally relevant, and come from authoritative sources.
With all that out of the way, let’s look at some ideas on how lawyers and law firms can build relevant backlinks.
Publish press releases
Following an important case, it’s a good idea to issue a press release and distribute it online. Depending on the type of the case, it can gain interest from international, national, and local magazines alike.
One example of this is the Johnny Depp and Amber Heard case led by Camille Vasquez and Benjamin Chew from Brown Rudnick. As you can see below, that case earned that law firm follow links from 213 quality domains. Some are local, and some are national/international.
Some other ideas for press releases include:
- New hires.
- Mergers.
- Important company statements.
Look for newsjacking opportunities
Also called “reactive PR,” this technique is about providing reliable information on current events.
This requires regular monitoring of what’s happening in the world or your local area related to your law specialization. Here are two ways to do this and remain sane. You can:
- Hire someone, e.g., a local PR agency.
- Use a web monitoring tool like Google Alerts. If you’re an Ahrefs user, you can also use the Mentions tool.
Link from your publications, teaching, or public speaking events
Lawyers often have the opportunity to teach at universities and present lectures at conferences. Oftentimes, this will come with the possibility of including a link in the lecturer’s bio. It’s a great opportunity to earn a link from a domain with high authority (strong backlink profile) and local relevance, as in the example below.
Go after guest blogging opportunities
Guest blogging is a common link building practice. Yet the availability of opportunities varies depending on the topic. Below is an example guest post on TechCrunch about the legal issues with the startup credo “move fast and break things” that links back to the law firm of the authors.
Here’s how you can find and vet guest blogging opportunities using Ahrefs’ Content Explorer. You can:
- Type in law AND (“guest article” OR “guest post”) in the search bar. This will search our database for the word “law” and at least one of the two phrases “guest article” or “guest post.”
- Set the website traffic filter to “From 500” to filter out new websites and websites with potentially low quality.
- Turn on the “Only live” filter to weed out broken pages.
- Use the “One page per domain” option because we only want a single result from any website.
Here’s an example find. Note that you can instantly see metrics of each page, which can help you vet prospects.
Answer journalist requests
Services like HARO, ResponseSource, and SourceBottle allow you to track journalist requests for expert commentary on legal matters (or from a legal perspective). If your commentary appears in a newspaper or magazine, you benefit twofold: You earn a link and increase awareness of your law firm.
All you need to do is to sign up for their services, subscribe to topics that interest you, and wait for an email with the latest request. If something piques your interest, answer as soon as possible.
Additionally, you can follow the #journorequest hashtag on Twitter.
If you can, prioritize local news and magazines because those links will have local relevance that can help you rank for keywords with local intent.
Local rankings
Here’s the last thing on our menu: local rankings published by local magazines, blogs, or review sites.
Not to be confused with local listings and directories featured at the beginning of the article.
While “local rankings link building” is a sound tactic for any local business to pursue, I haven’t seen many of those opportunities in the law niche. Still, if that kind of opportunity knocks on your door, give it serious consideration. Just remember to evaluate it in terms of contextual relevance, local relevance, and authority.
That concludes the link building section. If you want to learn more about link building, see our detailed guides:
Next stop: how to stay on top of technical SEO and SEO tracking.
The “SEO health” of your website can impact your rankings or prevent you from showing up on Google altogether. Here, we’re stepping into the territory of technical SEO: optimizing your website to help search engines find, crawl, understand, and index your pages. Fortunately, there are tools for that.
Tl;dr: The easiest way to keep your website’s SEO health in shape is to get a tool like Site Audit and fix any error it reports (also available for free in Ahrefs Webmaster Tools).
Looking into technical SEO issues is not something that will consume a lot of your time on a regular basis. Once you make sure your site is crawlable and indexable and fix any errors or warnings that may already be occurring on your site (e.g., broken links, slow page loading), it’s a matter of occasionally checking on the report.
For a deeper dive into the subject of technical SEO, check these out:
Tracking your progress “manually” on Google is not reliable because Google personalizes results based on factors like search history, device, and current location. Here are some tools you can use instead.
Starting from Google Business Profile, Google allows you to track a set of performance metrics for free within the service. For example, you can see queries people used to find your profile, the number of direction requests, or the number of people who viewed the profile.
You’ll also need a tool to track your Google Map Pack performance, e.g., the freemium Grid My Business or Local Falcon.
If you want to track all of your keyword rankings, try a tool like Ahrefs’ Rank Tracker. It lets you track up to 10,000 keyword rankings for “regular” organic search by country, state, city, and even ZIP/postal code.
Recommended reading: 10 SEO Metrics That Actually Matter (And 4 That Don’t)
Final thoughts
While SEO can bring you traffic that you don’t need to pay for, it’s worth noting that this marketing tactic takes time and effort. The more competitive the keywords you try to rank for, the more time it can take you to rank for them.
The first steps will probably be the hardest, so it may not be the best idea to bet everything on SEO just yet. But once you get the process up and running, you can use the same techniques over and over again for consistent results with compounding effects.
Got questions? Ping me on Twitter.
How Compression Can Be Used To Detect Low Quality Pages
The concept of Compressibility as a quality signal is not widely known, but SEOs should be aware of it. Search engines can use web page compressibility to identify duplicate pages, doorway pages with similar content, and pages with repetitive keywords, making it useful knowledge for SEO.
Although the following research paper demonstrates a successful use of on-page features for detecting spam, the deliberate lack of transparency by search engines makes it difficult to say with certainty if search engines are applying this or similar techniques.
What Is Compressibility?
In computing, compressibility refers to how much a file (data) can be reduced in size while retaining essential information, typically to maximize storage space or to allow more data to be transmitted over the Internet.
TL;DR Of Compression
Compression replaces repeated words and phrases with shorter references, reducing the file size by significant margins. Search engines typically compress indexed web pages to maximize storage space, reduce bandwidth, and improve retrieval speed, among other reasons.
This is a simplified explanation of how compression works:
- Identify Patterns: A compression algorithm scans the text to find repeated words, patterns, and phrases.
- Shorter Codes Take Up Less Space: The codes and symbols use less storage space than the original words and phrases, which results in a smaller file size.
- Shorter References Use Fewer Bits: The “code” that essentially symbolizes the replaced words and phrases uses less data than the originals.
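The steps above can be illustrated with a toy word-substitution scheme. To be clear, this is not how GZIP or any production compressor works internally. It’s a simplified sketch, with made-up function names and sample text, that only shows the idea of swapping repeated words for shorter codes. It also assumes the text contains no tokens starting with “~” and is separated by single spaces.

```python
from collections import Counter


def toy_compress(text: str) -> tuple[str, dict[str, str]]:
    """Replace repeated words with short codes such as ~0, ~1, ..."""
    words = text.split()
    codes: dict[str, str] = {}
    # Step 1: identify patterns (here: words that occur more than once).
    for word, count in Counter(words).most_common():
        code = f"~{len(codes)}"
        # Steps 2 and 3: only substitute when the code is shorter than the word.
        if count > 1 and len(word) > len(code):
            codes[word] = code
    encoded = " ".join(codes.get(word, word) for word in words)
    return encoded, codes


def toy_decompress(encoded: str, codes: dict[str, str]) -> str:
    """Reverse the substitution using the same code table."""
    reverse = {code: word for word, code in codes.items()}
    return " ".join(reverse.get(token, token) for token in encoded.split())


text = "cheap lawyer best lawyer cheap lawyer near me best lawyer cheap lawyer"
encoded, codes = toy_compress(text)
print(encoded)
print(len(text), "->", len(encoded), "characters (code table not counted)")
```

The more often the same words repeat, the more the text shrinks, which is why keyword-stuffed pages compress unusually well.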
A bonus effect of using compression is that it can also be used to identify duplicate pages, doorway pages with similar content, and pages with repetitive keywords.
Research Paper About Detecting Spam
This research paper is significant because it was authored by distinguished computer scientists known for breakthroughs in AI, distributed computing, information retrieval, and other fields.
Marc Najork
One of the co-authors of the research paper is Marc Najork, a prominent research scientist who currently holds the title of Distinguished Research Scientist at Google DeepMind. He’s a co-author of the papers for TW-BERT, has contributed research for increasing the accuracy of using implicit user feedback like clicks, and worked on creating improved AI-based information retrieval (DSI++: Updating Transformer Memory with New Documents), among many other major breakthroughs in information retrieval.
Dennis Fetterly
Another of the co-authors is Dennis Fetterly, currently a software engineer at Google. He is listed as a co-inventor in a patent for a ranking algorithm that uses links, and is known for his research in distributed computing and information retrieval.
Those are just two of the distinguished researchers listed as co-authors of the 2006 Microsoft research paper about identifying spam through on-page content features. Among the several on-page content features the research paper analyzes is compressibility, which they discovered can be used as a classifier for indicating that a web page is spammy.
Detecting Spam Web Pages Through Content Analysis
Although the research paper was authored in 2006, its findings remain relevant today.
Then, as now, people attempted to rank hundreds or thousands of location-based web pages that were essentially duplicate content aside from city, region, or state names. Then, as now, SEOs often created web pages for search engines by excessively repeating keywords within titles, meta descriptions, headings, internal anchor text, and within the content to improve rankings.
Section 4.6 of the research paper explains:
“Some search engines give higher weight to pages containing the query keywords several times. For example, for a given query term, a page that contains it ten times may be higher ranked than a page that contains it only once. To take advantage of such engines, some spam pages replicate their content several times in an attempt to rank higher.”
The research paper explains that search engines compress web pages and use the compressed version to reference the original web page. The authors note that an excessive amount of redundant words results in a higher level of compressibility. So they set about testing whether there is a correlation between a high level of compressibility and spam.
They write:
“Our approach in this section to locating redundant content within a page is to compress the page; to save space and disk time, search engines often compress web pages after indexing them, but before adding them to a page cache.
…We measure the redundancy of web pages by the compression ratio, the size of the uncompressed page divided by the size of the compressed page. We used GZIP …to compress pages, a fast and effective compression algorithm.”
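The ratio described in that passage is simple to reproduce. The following is a minimal sketch, assuming Python's standard gzip module as a stand-in for whatever GZIP setup the researchers used. The two sample pages and the function name compression_ratio are invented for illustration and are not from the paper.

```python
import gzip


def compression_ratio(text: str) -> float:
    """Uncompressed size divided by GZIP-compressed size, both in bytes."""
    raw = text.encode("utf-8")
    compressed = gzip.compress(raw)
    return len(raw) / len(compressed)


# Hypothetical sample pages, made up for this illustration only.
repetitive_page = "cheap plumber in Springfield best plumber Springfield " * 200
varied_page = (
    "Our office handles residential repairs, from leaking taps to full "
    "bathroom refits. Call ahead to book a visit, or read our guide to "
    "winterising outdoor pipes before the first frost arrives."
)

# The keyword-stuffed page compresses far more than the varied one.
print(round(compression_ratio(repetitive_page), 1))
print(round(compression_ratio(varied_page), 1))
```

Very short pages can score below 1.0 because the GZIP header adds a fixed overhead, so the ratio is most meaningful on pages of realistic length.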
High Compressibility Correlates To Spam
The results of the research showed that web pages with a compression ratio of at least 4.0 tended to be low-quality spam pages. However, at the highest compression ratios the trend became less consistent because there were fewer data points, making those results harder to interpret.
Figure 9: Prevalence of spam relative to compressibility of page.
The researchers concluded:
“70% of all sampled pages with a compression ratio of at least 4.0 were judged to be spam.”
But they also discovered that using the compression ratio by itself still resulted in false positives, where non-spam pages were incorrectly identified as spam:
“The compression ratio heuristic described in Section 4.6 fared best, correctly identifying 660 (27.9%) of the spam pages in our collection, while misidentifying 2,068 (12.0%) of all judged pages.
Using all of the aforementioned features, the classification accuracy after the ten-fold cross validation process is encouraging:
95.4% of our judged pages were classified correctly, while 4.6% were classified incorrectly.
More specifically, for the spam class 1,940 out of the 2,364 pages, were classified correctly. For the non-spam class, 14,440 out of the 14,804 pages were classified correctly. Consequently, 788 pages were classified incorrectly.”
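The figures in that quote can be checked with a few lines of arithmetic. This sketch uses only the counts quoted above. The variable names are illustrative.

```python
# Counts taken from the quoted passage above.
spam_total, spam_correct = 2364, 1940
nonspam_total, nonspam_correct = 14804, 14440

total = spam_total + nonspam_total
correct = spam_correct + nonspam_correct

missed_spam = spam_total - spam_correct            # spam labelled as non-spam
false_positives = nonspam_total - nonspam_correct  # non-spam labelled as spam
misclassified = missed_spam + false_positives

print(missed_spam, false_positives, misclassified)  # 424 364 788
print(f"{correct / total:.1%}")                     # 95.4%
print(f"{misclassified / total:.1%}")               # 4.6%
```

The 788 misclassified pages split into 424 spam pages that were missed and 364 legitimate pages that were wrongly flagged.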
The next section describes an interesting discovery about how to increase the accuracy of using on-page signals for identifying spam.
Insight Into Quality Rankings
The research paper examined multiple on-page signals, including compressibility. The researchers discovered that each individual signal (classifier) was able to find some spam, but that relying on any one signal on its own resulted in non-spam pages being flagged as spam. These are commonly referred to as false positives.
The researchers made an important discovery that everyone interested in SEO should know, which is that using multiple classifiers increased the accuracy of detecting spam and decreased the likelihood of false positives. Just as important, the compressibility signal only identifies one kind of spam but not the full range of spam.
The takeaway is that compressibility is a good way to identify one kind of spam, but other kinds of spam aren’t caught by this one signal.
This is the part that every SEO and publisher should be aware of:
“In the previous section, we presented a number of heuristics for assaying spam web pages. That is, we measured several characteristics of web pages, and found ranges of those characteristics which correlated with a page being spam. Nevertheless, when used individually, no technique uncovers most of the spam in our data set without flagging many non-spam pages as spam.
For example, considering the compression ratio heuristic described in Section 4.6, one of our most promising methods, the average probability of spam for ratios of 4.2 and higher is 72%. But only about 1.5% of all pages fall in this range. This number is far below the 13.8% of spam pages that we identified in our data set.”
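A rough back-of-the-envelope calculation shows why the researchers call this "far below". This sketch treats the quoted percentages as exact, which they are not ("about 1.5%"), so the result is only an estimate and is not a figure from the paper.

```python
# Percentages taken from the quoted passage above, treated as exact.
share_of_pages_flagged = 0.015  # pages with a compression ratio of 4.2 or higher
spam_rate_when_flagged = 0.72   # average probability of spam in that range
spam_share_of_dataset = 0.138   # spam pages as a share of the whole data set

# Spam pages caught by the threshold, as a share of all pages.
spam_caught = share_of_pages_flagged * spam_rate_when_flagged

# Estimated share of all spam that the threshold would uncover.
recall_estimate = spam_caught / spam_share_of_dataset

print(f"{spam_caught:.2%} of all pages are spam caught by the threshold")
print(f"roughly {recall_estimate:.1%} of all spam is uncovered")
```

Under those assumptions, a threshold of 4.2 would uncover only about 8% of the spam in the data set, even though most of what it flags is spam.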
So, even though compressibility was one of the better signals for identifying spam, it still was unable to uncover the full range of spam within the dataset the researchers used to test the signals.
Combining Multiple Signals
The above results indicated that individual signals of low quality are less accurate when used alone. So the researchers tested using multiple signals. What they discovered was that combining multiple on-page signals for detecting spam resulted in a better accuracy rate, with fewer pages misclassified as spam.
The researchers explained that they tested the use of multiple signals:
“One way of combining our heuristic methods is to view the spam detection problem as a classification problem. In this case, we want to create a classification model (or classifier) which, given a web page, will use the page’s features jointly in order to (correctly, we hope) classify it in one of two classes: spam and non-spam.”
These are their conclusions about using multiple signals:
“We have studied various aspects of content-based spam on the web using a real-world data set from the MSNSearch crawler. We have presented a number of heuristic methods for detecting content based spam. Some of our spam detection methods are more effective than others, however when used in isolation our methods may not identify all of the spam pages. For this reason, we combined our spam-detection methods to create a highly accurate C4.5 classifier. Our classifier can correctly identify 86.2% of all spam pages, while flagging very few legitimate pages as spam.”
Key Insight:
Misidentifying “very few legitimate pages as spam” was a significant breakthrough. The important insight that everyone involved with SEO should take away from this is that one signal by itself can result in false positives. Using multiple signals increases the accuracy.
What this means is that SEO tests of isolated ranking or quality signals will not yield reliable results that can be trusted for making strategy or business decisions.
Takeaways
We don’t know for certain if compressibility is used at the search engines, but it’s an easy-to-use signal that, combined with others, could be used to catch simple kinds of spam like thousands of city-name doorway pages with similar content. Yet even if the search engines don’t use this signal, it does show how easy it is to catch that kind of search engine manipulation and that it’s something search engines are well able to handle today.
Here are the key points of this article to keep in mind:
- Doorway pages with duplicate content are easy to catch because they compress at a higher ratio than normal web pages.
- Groups of web pages with a compression ratio above 4.0 were predominantly spam.
- Negative quality signals used by themselves to catch spam can lead to false positives.
- In this particular test, they discovered that on-page negative quality signals only catch specific types of spam.
- When used alone, the compressibility signal only catches redundancy-type spam, fails to detect other forms of spam, and leads to false positives.
- Combining quality signals improves spam detection accuracy and reduces false positives.
- Search engines today have a higher accuracy of spam detection with the use of AI like SpamBrain.
Read the research paper, which is linked from the Google Scholar page of Marc Najork:
Detecting spam web pages through content analysis
Featured Image by Shutterstock/pathdoc
SEO
New Google Trends SEO Documentation
Google Search Central published new documentation on Google Trends, explaining how to use it for search marketing. This guide serves as an easy-to-understand introduction for newcomers and a helpful refresher for experienced search marketers and publishers.
The new guide has six sections:
- About Google Trends
- Tutorial on monitoring trends
- How to do keyword research with the tool
- How to prioritize content with Trends data
- How to use Google Trends for competitor research
- How to use Google Trends for analyzing brand awareness and sentiment
The section about monitoring trends explains that there are two kinds of rising trends, general and specific, both of which can be useful for developing content to publish on a site.
Using the Explore tool, you can leave the search box empty and view the current rising trends worldwide, or use a drop-down menu to focus on trends in a specific country. Users can further filter rising trends by time period, category, and type of search. The results show rising trends by topic and by keyword.
To search for specific trends, users just need to enter the specific queries and then filter them by country, time, category, and type of search.
The section called Content Calendar describes how to use Google Trends to understand which content topics to prioritize.
Google explains:
“Google Trends can be helpful not only to get ideas on what to write, but also to prioritize when to publish it. To help you better prioritize which topics to focus on, try to find seasonal trends in the data. With that information, you can plan ahead to have high quality content available on your site a little before people are searching for it, so that when they do, your content is ready for them.”
Read the new Google Trends documentation:
Get started with Google Trends
Featured Image by Shutterstock/Luis Molinero
SEO
All the best things about Ahrefs Evolve 2024
Hey all, I’m Rebekah and I am your Chosen One to “do a blog post for Ahrefs Evolve 2024”.
What does that entail exactly? I don’t know. In fact, Sam Oh asked me yesterday what the title of this post would be. “Is it like…Ahrefs Evolve 2024: Recap of day 1 and day 2…?”
Even as I nodded, I couldn’t get over how absolutely boring that sounded. So I’m going to do THIS instead: a curation of all the best things YOU loved about Ahrefs’ first conference, lifted directly from X.
Let’s go!
OUR HUGE SCREEN
The largest presentation screen I’ve ever seen! #ahrefsevolve pic.twitter.com/oboiMFW1TN
— Patrick Stox (@patrickstox) October 24, 2024
This is the biggest presentation screen I ever seen in my life. It’s like iMax for SEO presentations. #ahrefsevolve pic.twitter.com/sAfZ1rtePx
— Suganthan Mohanadasan (@Suganthanmn) October 24, 2024
CONFERENCE VENUE ITSELF
It was recently named the best new skyscraper in the world, by the way.
The Ahrefs conference venue feels like being in inception. #AhrefsEvolve pic.twitter.com/18Yjai1Cej
— Suganthan Mohanadasan (@Suganthanmn) October 24, 2024
I’m in Singapore for @ahrefs Evolve this week. Keen to connect with people doing interesting work on the future of search / AI #ahrefsevolve pic.twitter.com/s00UkIbxpf
— Alex Denning (@AlexDenning) October 23, 2024
OUR AMAZING SPEAKER LINEUP – SUPER INFORMATIVE, USEFUL TALKS!
A super insightful explanation of how Google Search Ranking works #ahrefsevolve pic.twitter.com/Cd1VSET2Aj
— Amanda Walls (@amandajwalls) October 24, 2024
“would I even do this if Google didn’t exist?” – what a great question to assess if you actually have the right focus when creating content amazing presentation from @amandaecking at #AhrefsEvolve pic.twitter.com/a6OKbKxwiS
— Aleyda Solis ️ (@aleyda) October 24, 2024
Attending @CyrusShepard ‘s talk on WTF is Helpful Content in Google’s algorithm at #AhrefsEvolve
“Focus on people first content”
Super relevant for content creators who want to stay ahead of the ever evolving Google search curve! #SEOTalk #SEO pic.twitter.com/KRTL13SB0g
— Parth Suba (@parthsuba77) October 24, 2024
This is the first time I am listening to @aleyda and it is really amazing. Lot of insights and actionable information.
Thank you #aleyda for power packed presentation.#AhrefsEvolve @ahrefs #seo pic.twitter.com/Xe3A9MGfrr
— Jignesh Gohel (@jigneshgohel) October 25, 2024
@thinking_slows thoughts on AI content – “it’s very good if you want to be average”.
We can do a lot better and Ryan explains how. Love it @ahrefs #AhrefsEvolve pic.twitter.com/qFqWs6QBH5
— Andy Chadwick (@digitalquokka) October 24, 2024
GREAT MUSIC
First time I’ve ever Shazam’d a track during SEO conference ambience…. and the track wasn’t even Shazamable! #AhrefsEvolve @ahrefs pic.twitter.com/ZDzJOZMILt
— Lily Ray (@lilyraynyc) October 24, 2024
AMAZING GOODIES
Ahrefs Evolveきました!@ahrefs @AhrefsJP #AhrefsEvolve pic.twitter.com/33EiejQPdX
— さくらぎ (@sakuragi_ksy) October 24, 2024
Aside from the very interesting topics, what makes this conference even cooler are the ton of awesome freebies
Kudos for making all of these happen for #AhrefsEvolve @ahrefs team pic.twitter.com/DGzk5FSTN8
— Krista Melgarejo (@kimelgarejo) October 24, 2024
Content Goblin and SEO alligator party stickers are definitely going on my laptop. @ahrefs #ahrefsevolve pic.twitter.com/QBsBuY5Yix
— Patrick Stox (@patrickstox) October 24, 2024
This is one of the best swag bags I’ve received at any conference!
Either @ahrefs actually cares or the other conference swag bags aren’t up to par w Ahrefs!#AhrefsEvolve pic.twitter.com/Yc9e6wZPHn
— Moses Sanchez (@SanchezMoses) October 25, 2024
SELFIE BATTLE
Some background: Tim and Sam have a challenge going on to see who can take the most selfies with all of you. Last I heard, Sam was winning – but there is room for a comeback yet!
Got the rare selfie with both @timsoulo and @samsgoh #AhrefsEvolve
— Bernard Huang (@bernardjhuang) October 24, 2024
THAT BELL
Everybody’s just waiting for this one.
@timsoulo @ahrefs #AhrefsEvolve pic.twitter.com/6ypWaTGDDP
— Jinbo Liang (@JinboLiang) October 24, 2024
STICKER WALL
Viva la vida, viva Seo!
Awante Argentina loco!#AhrefsEvolve pic.twitter.com/sfhbI2kWSH
— Gaston Riera. (@GastonRiera) October 24, 2024
AND, OF COURSE…ALL OF YOU!
#AhrefsEvolve let’s goooooooooooo!!! pic.twitter.com/THtdvdtUyB
— Tim Soulo (@timsoulo) October 24, 2024
–
There’s a TON more content on LinkedIn – click here – but I have limited time to get this post up and can’t quite figure out how to embed LinkedIn posts so…let’s stop here for now. I’ll keep updating as we go along!