How & Why To Prevent Bots From Crawling Your Site
For the most part, bots and spiders are relatively harmless.
You want Google’s bot, for example, to crawl and index your website.
However, bots and spiders can sometimes be a problem and generate unwanted traffic.
This kind of unwanted traffic can result in:
- Obfuscation of where the traffic is coming from.
- Confusing, hard-to-understand reports.
- Misattribution in Google Analytics.
- Increased bandwidth costs.
- Other nuisances.
There are good bots and bad bots.
Good bots run in the background and rarely, if ever, harm another user or website.
Bad bots break the security behind a website or are enlisted in wide, large-scale botnets to deliver DDoS attacks against large organizations (something a single machine could never take down on its own).
Here’s what you should know about bots and how to prevent the bad ones from crawling your site.
What Is A Bot?
Looking at exactly what a bot is can help identify why we need to block it and keep it from crawling our site.
A bot, short for “robot,” is a software application designed to perform a specific task over and over.
For many SEO professionals, utilizing bots goes along with scaling an SEO campaign.
“Scaling” means you automate as much work as possible to get better results faster.
Common Misconceptions About Bots
You may have run into the misconception that all bots are evil and must be banned unequivocally from your site.
But this could not be further from the truth.
Google is a bot.
If you block Google, can you guess what will happen to your search engine rankings?
Some bots can be malicious, designed to create fake content or pose as legitimate websites to steal your data.
However, bots are not always malicious scripts run by bad actors.
Some can be great tools that help make work easier for SEO professionals, such as automating common repetitive tasks or scraping useful information from search engines.
Some common bots SEO professionals use are Semrush and Ahrefs.
These bots scrape useful data from the search engines, help SEO pros automate and complete repetitive tasks, and generally make day-to-day SEO work easier.
Why Would You Need to Block Bots From Crawling Your Site?
While there are many good bots, there are also bad bots.
Bad bots can steal your private data or take down an otherwise healthy website.
We want to block any bad bots we can uncover.
It’s not easy to discover every bot that may crawl your site, but with a little digging, you can find the malicious ones you don’t want visiting your site anymore.
So why would you need to block bots from crawling your website?
Common reasons you may want to block bots from crawling your site include:
Protecting Your Valuable Data
Perhaps you found that a plugin is attracting a number of malicious bots that want to steal your valuable consumer data.
Or, you found that a bot took advantage of a security vulnerability to add bad links all over your site.
Or, someone keeps trying to spam your contact form with a bot.
This is where you need to take certain steps to protect your valuable data from getting compromised by a bot.
Bandwidth Overages
If you get an influx of bot traffic, chances are your bandwidth will skyrocket as well, leading to unforeseen overages and charges you would rather not have.
You absolutely want to block the offending bots from crawling your site in these cases.
You don’t want to end up paying thousands of dollars for bandwidth that bots, rather than real visitors, consumed.
What’s bandwidth?
Bandwidth is the transfer of data from your server to the client-side (web browser).
Every time data is sent over a connection, you use bandwidth.
When bots access your site, they consume bandwidth, and you could incur overage charges for exceeding your monthly allotment.
Your host should have given you detailed information about those limits when you signed up for your hosting package.
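As a rough, hypothetical illustration of how quickly this adds up: a bot that requests a 2 MB page 50,000 times in a month burns through about 100 GB of transfer on its own, which can easily push a smaller hosting plan past its monthly cap.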
Limiting Bad Behavior
If a malicious bot somehow started targeting your site, it would be appropriate to take steps to control this.
For example, you would want to ensure that this bot cannot access your contact forms or other sensitive parts of your site.
Do this before the bot can compromise your most critical files.
By ensuring your site is properly locked down and secure, it is possible to block these bots so they don’t cause too much damage.
How To Block Bots From Your Site Effectively
You can use two methods to block bots from your site effectively.
The first is through robots.txt.
This is a plain text file that sits at the root of your web server. You may not have one by default, in which case you would have to create one.
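For example, on most sites it lives at a URL like https://www.example.com/robots.txt (example.com is a placeholder here), and you can create it with any text editor.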
Here are a few highly useful robots.txt directives you can use to block most spiders and bots from your site:
Disallow Googlebot From Your Server
If, for some reason, you want to stop Googlebot from crawling your server at all, here is the code you would use:
User-agent: Googlebot
Disallow: /
You only want to use this directive if you’re certain you want to keep Googlebot from crawling your site at all.
Don’t use this on a whim!
Have a specific reason for making sure you don’t want bots crawling your site at all.
For example, a common issue is wanting to keep your staging site out of the index.
You don’t want Google crawling both the staging site and your live site, because that doubles up your content and creates duplicate content issues.
Disallowing All Bots From Your Server
If you want to keep all bots from crawling your site, here is the code you will want to use:
User-agent: *
Disallow: /
This is the code to disallow all bots. Remember our staging site example from above?
Perhaps you want to exclude the staging site from all bots before you fully deploy your site.
Or perhaps you want to keep your site private for a time before launching it to the world.
Either way, this will keep your site hidden from prying eyes.
Keeping Bots From Crawling a Specific Folder
If, for some reason, you want to keep bots from crawling a specific folder, you can do that too.
The following is the code you would use:
User-agent: *
Disallow: /folder-name/
There are many reasons someone would want to exclude bots from a folder. Perhaps you want to ensure that certain content on your site isn’t indexed.
Or maybe that particular folder will cause certain types of duplicate content issues, and you want to exclude it from crawling entirely.
Either way, this will help you do that.
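If you need finer control, Google and most major crawlers also honor an Allow directive, so you can block a folder while leaving one page inside it crawlable. Here is a minimal sketch (the folder and file names are hypothetical):
User-agent: *
Disallow: /private-folder/
Allow: /private-folder/public-page.html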
Common Mistakes With Robots.txt
There are several mistakes that SEO professionals make with robots.txt. The most common mistakes include:
- Using both disallow in robots.txt and noindex.
- Using the forward slash / (all folders down from root), when you really mean a specific URL.
- Not including the correct path.
- Not testing your robots.txt file.
- Not knowing the correct name of the user-agent you want to block.
Using Both Disallow In Robots.txt And Noindex On The Page
Google’s John Mueller has stated you should not be using both disallow in robots.txt and noindex on the page itself.
If you do both, Google cannot crawl the page to see the noindex, so it could potentially still index the page anyway.
This is why you should only use one or the other, and not both.
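For reference, the on-page noindex discussed here is a robots meta tag placed in the page’s <head>, for example:
<meta name="robots" content="noindex">
The same directive can also be delivered as an X-Robots-Tag HTTP header, which is useful for non-HTML files.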
Using The Forward Slash When You Really Mean A Specific URL
The forward slash after Disallow means “from this root folder on down, completely and entirely for eternity.”
Every page on your site will be blocked forever until you change it.
One of the most common issues I find in website audits is that someone accidentally added a forward slash to “Disallow:” and blocked Google from crawling their entire site.
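To make the difference concrete, here is an illustration (the page path is made up). This rule blocks only one section:
User-agent: *
Disallow: /landing-page/
While this rule, just one character shorter, blocks the entire site:
User-agent: *
Disallow: /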
Not Including The Correct Path
We understand. Sometimes coding robots.txt can be a tough job.
You couldn’t remember the exact path, so you went through the file and ended up winging it.
The problem is that a path that’s even one character off won’t match the URL you actually meant to block.
This is why it’s important to always double-check the paths you use for specific URLs.
You don’t want to risk adding a rule to robots.txt that doesn’t actually do its job.
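One way to double-check is to test the finished file programmatically. Below is a sketch using Python’s built-in urllib.robotparser; the domain and paths are placeholders, so swap in your own:
from urllib import robotparser

# Point the parser at your live robots.txt file (placeholder domain).
rp = robotparser.RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")
rp.read()

# Check whether a given user-agent is allowed to fetch a given URL.
print(rp.can_fetch("*", "https://www.example.com/folder-name/page"))   # expect False if disallowed
print(rp.can_fetch("Googlebot", "https://www.example.com/blog/post/")) # expect True if not blocked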
Not Knowing The Correct Name Of The User-Agent
If you want to block a particular user-agent but you don’t know the name of that user-agent, that’s a problem.
Rather than using the name you think you remember, do some research and figure out the exact name of the user-agent that you need.
If you are trying to block specific bots, then that name becomes extremely important in your efforts.
Why Else Would You Block Bots And Spiders?
There are other reasons SEO pros would want to block bots from crawling their site.
Perhaps they are deep into gray hat (or black hat) PBNs, and they want to hide their private blog network from prying eyes (especially their competitors).
They can do this by utilizing robots.txt to block common bots that SEO professionals use to assess their competition.
For example, Semrush and Ahrefs.
If you wanted to block Ahrefs, this is the code to do so:
User-agent: AhrefsBot
Disallow: /
This will block AhrefsBot from crawling your entire site.
If you want to block Semrush, the code below will do so; Semrush documents these user-agents and additional instructions in its own bot documentation.
There are a lot of lines of code to add, so be careful when adding these:
To block SemrushBot from crawling your site for different SEO and technical issues:
User-agent: SiteAuditBot
Disallow: /
To block SemrushBot from crawling your site for the Backlink Audit tool:
User-agent: SemrushBot-BA
Disallow: /
To block SemrushBot from crawling your site for the On Page SEO Checker tool and similar tools:
User-agent: SemrushBot-SI
Disallow: /
To block SemrushBot from checking URLs on your site for the SWA tool:
User-agent: SemrushBot-SWA
Disallow: /
To block SemrushBot from crawling your site for the Content Analyzer and Post Tracking tools:
User-agent: SemrushBot-CT
Disallow: /
To block SemrushBot from crawling your site for Brand Monitoring:
User-agent: SemrushBot-BM
Disallow: /
To block SplitSignalBot from crawling your site for the SplitSignal tool:
User-agent: SplitSignalBot
Disallow: /
To block SemrushBot-COUB from crawling your site for the Content Outline Builder tool:
User-agent: SemrushBot-COUB
Disallow: /
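Note that all of these groups can live together in one robots.txt file, listed one after another. A trimmed sketch combining two of the user-agents mentioned above:
User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot-BA
Disallow: /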
Using Your .htaccess File To Block Bots
If you are on an Apache web server, you can use your site’s .htaccess file to block specific bots.
For example, here is how you would use .htaccess code to block AhrefsBot by IP address.
Please note: be careful with this code.
If you don’t know what you are doing, you could bring down your server.
We only provide this code here for example purposes.
Make sure you do your research and practice on your own before adding it to a production server.
Order Allow,Deny
Deny from 51.222.152.133
Deny from 54.36.148.1
Deny from 195.154.122
Allow from all
For this to work properly, make sure you block all of the IP ranges that Ahrefs publishes on its blog; the addresses above are only a partial example.
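If you would rather match on the bot’s user-agent string than maintain a list of IP addresses, mod_rewrite offers another approach. This is only a sketch, assuming Apache with mod_rewrite enabled, and the same caution about production servers applies:
<IfModule mod_rewrite.c>
RewriteEngine On
# Return 403 Forbidden for any request whose User-Agent contains "AhrefsBot"
RewriteCond %{HTTP_USER_AGENT} AhrefsBot [NC]
RewriteRule .* - [F,L]
</IfModule>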
If you want a comprehensive introduction to .htaccess, look no further than the tutorial on Apache.org.
Apache’s documentation also covers the access-control directives you need if you want to block specific types of bots.
Blocking Bots and Spiders Can Require Some Work
But it’s well worth it in the end.
By blocking bad bots and spiders from crawling your site, you don’t fall into the same trap as others.
You can rest easier knowing your site is protected against unwanted automated traffic.
When you can control these particular bots, it makes things that much better for you, the SEO professional.
If you have to, always make sure you block the required bots and spiders from crawling your site.
This will result in enhanced security, a better overall online reputation, and a much better site that will be there in the years to come.
Featured Image: Roman Samborskyi/Shutterstock
Google Updating Cryptocurrency Advertising Policy For 2024

Google published an announcement of upcoming changes to its cryptocurrency advertising policies and advised advertisers to make themselves aware of the changes and prepare to comply with the new requirements.
The upcoming updates are to Google’s Cryptocurrencies and related products policy for the advertisement of Cryptocurrency Coin Trusts. The changes are set to take effect on January 29th, 2024.
Cryptocurrency Coin Trusts are financial products that enable investors to trade shares in trusts holding substantial amounts of digital currency. These trusts provide investors with equity in cryptocurrencies without having direct ownership. They are also an option for creating a more diversified portfolio.
The policy updates coming in 2024 aim to describe the scope and requirements for advertising Cryptocurrency Coin Trusts. Advertisers targeting the United States will be able to promote these products and services as long as they abide by the specific policies outlined in the updated requirements and obtain certification from Google.
The updated policy changes are not limited to the United States. They will apply globally to all accounts advertising Cryptocurrency Coin Trusts.
Google’s announcement also reminded advertisers of their obligation to comply with local laws in the areas where their ads are targeted.
Google’s approach for violations of the new policy will be to first give a warning before imposing an account suspension.
Advertisers that fail to comply with the updated policy will receive a warning at least seven days before a potential account suspension. This time period provides advertisers with an opportunity to fix non-compliance issues and to get back into compliance with the revised guidelines.
Advertisers are encouraged to refer to Google’s documentation on “About restricted financial products certification.”
The deadline for the change in policy is January 29th, 2024. Cryptocurrency Coin Trusts advertisers will need to pay close attention to the updated policies in order to ensure compliance.
Read Google’s announcement:
Updates to Cryptocurrencies and related products policy (December 2023)
SEO Trends You Can’t Ignore In 2024

Most SEO trends fade quickly. But some of them stick and deserve your attention.
Let’s explore what those are and how to take advantage of them.
If you give ChatGPT a title and ask it to write a blog post, it will—in seconds.
This is super impressive, but there are a couple of issues:
- Everyone else using ChatGPT is creating the same content. It’s the same for users of other GPT-powered AI writing tools, too—which is basically all of them.
- The content is extremely dull. Sure, you can ask ChatGPT to “make it more entertaining,” but it usually overcompensates and hands back a cringe version of the same boring content.
In the words of Gael Breton:
How to take advantage of this SEO trend
Don’t use AI to write entire articles. They’ll be boring as heck. Instead, use it as a creative sparring partner to help you write better content and automate monotonous tasks.
For example, you can ask ChatGPT to write an outline from a working title and a list of keywords (which you can pull from Ahrefs)—and it does a pretty decent job.
Prompt:
Create an outline for a post entitled “[working title]” based on these keywords: [list]
Result:


When you’ve written your draft, you can polish it in seconds by asking ChatGPT to proofread it.


Then you can automate the boring stuff, like creating more enticing title tags…


… and writing a meta description:


If you notice a few months down the line that your content ranks well but hasn’t won the featured snippet, ChatGPT can help with that, too.
For example, Ahrefs tells us we rank in position 3 for “affiliate marketing” but don’t own the snippet.


If we check Google, the snippet is a definition. Asking ChatGPT to simplify our definition may solve this problem.


In short, there are a near-infinite number of ways to use ChatGPT (and other AI writing tools) to create better content. And all of them buck the trend of asking it to write boring, boilerplate articles from scratch.
Programmatic SEO refers to the creation of keyword-targeted pages in an automatic (or near automatic) way.
Nomadlist’s location pages are a perfect example:


Each page focuses on a specific city and shares the same core information—internet speeds, cost, temperature, etc. All of this information is pulled programmatically from a database and the site gets an estimated 46k monthly search visits in total.


Programmatic SEO is nothing new. It’s been around forever. It’s just the hot thing right now because AI tools like ChatGPT make it easier and more accessible than ever before.
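To make “programmatic” concrete, here is a minimal sketch of the idea, assuming a hypothetical dataset of cities (the numbers are invented). A real implementation would pull rows from a database and publish each rendered page at its own URL:
from string import Template

# Hypothetical data that would normally come from a database.
cities = [
    {"name": "Lisbon", "internet_mbps": 40, "cost_usd": 2100, "temp_c": 21},
    {"name": "Chiang Mai", "internet_mbps": 25, "cost_usd": 1100, "temp_c": 29},
]

# One shared template; every generated page follows the same structure.
page = Template(
    "<title>Living in $name: cost, internet speed, weather</title>\n"
    "<h1>$name</h1>\n"
    "<p>Internet: $internet_mbps Mbps. Monthly cost: $$${cost_usd}. Average temperature: ${temp_c}C.</p>\n"
)

for city in cities:
    print(page.substitute(city))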
The problem? As John Mueller pointed out on X (formerly Twitter), much of it is spam:
I love fire, but also programmatic SEO is often a fancy banner for spam.
— I am John – ⭐ Say no to cookies – biscuits only ⭐ (@JohnMu) July 25, 2023
How to take advantage of this SEO trend
Don’t use programmatic SEO to publish insane amounts of spam that’ll probably get hit in the next Google update. Use it to scale valuable content that will stand the test of time.
For example, Wise’s currency conversion pages currently get an estimated 31.7M monthly search visits:


This is because the content is actually useful. Each page features an interactive tool showing the live exchange rate for any amount…


… the exchange rate over time…


… a handy email notification option when the exchange rates exceed a certain amount…


… handy conversion charts for popular amounts…


… and a comparison of the cheapest ways to send money abroad in your chosen currency:


It doesn’t matter that all of these pages use the same template. The data is exactly what you want to see when you search [currency 1] to [currency 2].
That’s probably why Wise ranks in the top 10 for over 66,000 of these keywords:


Looking to take advantage of programmatic content in 2024 like Wise? Check out the guide below.
People love ChatGPT because it answers questions fast and succinctly, so it’s no surprise that generative AI is already making its way into search.
For example, if you ask Bing for a definition or how to do something basic, AI will generate an answer on the fly right there in the search results.




In other words, thanks to AI, users no longer have to click on a search result for answers to simple questions. It’s like featured snippets on steroids.
This might not be a huge deal right now, but when Google’s version of this (Search Generative Experience) comes out of beta, many websites will see clicks fall off a cliff.
How to take advantage of this SEO trend
Don’t invest too much in topics that generative AI can easily answer. You’ll only lose clicks like crazy to AI in the long run. Instead, start prioritizing topics that AI will struggle to answer.
How do you know which topics it will struggle to answer? Try asking ChatGPT. If it gives a good and concise answer, it’s clearly an easy question.
For example, there are hundreds of searches for how to calculate a percentage in Google Sheets every month in the US:


If you ask ChatGPT for the solution, it gives you a perfect answer in about fifty words.


This is the perfect example of a topic where generative AI will remove the need to click on a search result for many.
That’s probably not going to be the case for a topic like this:


Sure. Generative AI might be able to tell you how to create a template—but it can’t make one for you. And even if it can in the future, it will never be a personal finance expert with experience. You’ll always have to click on a search result for a template created by that person.
These are the kinds of topics to prioritize in 2024 and beyond.
Sidenote.
None of this means you should stop targeting “simple” topics altogether. You’ll always be able to get some traffic from them. My point is not to be obsessed with ranking for keywords whose days are numbered. Prioritize topics with long-term value instead.
Bonus: 3 SEO trends to ignore in 2024
Not all SEO trends move the needle. Here are just a few of those trends and why you should ignore them.
People are using voice search more than ever
In 2014, Google revealed that 41% of Americans use voice search daily. According to research by UpCity, that number was up to 50% as of 2022. I haven’t seen any data for 2023 yet, but I’d imagine it’s above 50%.
Why you should ignore this SEO trend
75% of voice search results come from a page ranking in the top 3, and 40.7% come from a featured snippet. If you’re already optimizing for those things, there’s not much more you can do.
People are using visual search for shopping more than ever
In 2022, Insider Intelligence reported that 22% of US adults have shopped with visual search (Google Lens, Bing Visual Search, etc.). That number is up from just 15% in 2021.
Why you should ignore this SEO trend
Much like voice search, there’s no real way to optimize for visual search. Sure, it helps to have good quality product images, optimized filenames and alt text, and product schema markup on your pages—but you should be doing this stuff anyway as it’s been a best practice since forever.
People are using Bing more than ever before
Bing’s Yusuf Mehdi announced in March 2023 that the search engine had surpassed 100M daily active users for the first time ever. This came just one month after the launch of AI-powered Bing.
Why you should ignore this SEO trend
Bing might be more popular than ever, but its market share still only stands at around 3%, according to estimates by Statcounter. Google’s market share stands at roughly 92%, so that’s the one you should be optimizing for.
Plus, it’s often the case that if you rank in Google, you also rank in Bing—so it really doesn’t deserve any focus.
Final thoughts
Keeping your finger on the pulse and taking advantage of trends makes sense, but don’t let them distract you from the boring stuff that’s always worked: find what people are searching for > create content about it > build backlinks > repeat.
Got questions? Ping me on X (formerly Twitter).
Mozilla VPN Security Risks Discovered

Mozilla published the results of a recent third-party security audit of its VPN services as part of its commitment to user privacy and security. The audit revealed security issues that were presented to Mozilla and addressed with fixes to ensure user privacy and security.
Many search marketers use VPNs during the course of their business, especially when using a Wi-Fi connection, in order to protect sensitive data, so the trustworthiness of a VPN is essential.
Mozilla VPN
A Virtual Private Network (VPN) is a service that hides (encrypts) a user’s Internet traffic so that no third party (like an ISP) can snoop and see what sites a user is visiting.
VPNs also add a layer of security from malicious activities such as session hijacking which can give an attacker full access to the websites a user is visiting.
There is a high expectation from users that the VPN will protect their privacy when they are browsing on the Internet.
Mozilla thus employs the services of a third party to conduct a security audit to make sure their VPN is thoroughly locked down.
Security Risks Discovered
The audit revealed vulnerabilities of medium or higher severity, ranging from Denial of Service (DoS) risks to keychain access leaks (related to encryption) and a lack of access controls.
Cure53, the third-party security firm, discovered several risks, ranging from potential VPN leaks to a vulnerability that allowed a rogue extension to disable the VPN.
The scope of the audit encompassed the following products:
- Mozilla VPN Qt6 App for macOS
- Mozilla VPN Qt6 App for Linux
- Mozilla VPN Qt6 App for Windows
- Mozilla VPN Qt6 App for iOS
- Mozilla VPN Qt6 App for Android
These are the risks identified by the security audit:
- FVP-03-003: DoS via serialized intent
- FVP-03-008: Keychain access level leaks WG private key to iCloud
- FVP-03-010: VPN leak via captive portal detection
- FVP-03-011: Lack of local TCP server access controls
- FVP-03-012: Rogue extension can disable VPN using mozillavpnnp (High)
The rogue extension issue was rated as high severity. Each risk was subsequently addressed by Mozilla.
Mozilla presented the results of the security audit as part of their commitment to transparency and to maintain the trust and security of their users. Conducting a third party security audit is a best practice for a VPN provider that helps assure that the VPN is trustworthy and reliable.
Read Mozilla’s announcement:
Mozilla VPN Security Audit 2023
Featured Image by Shutterstock/Meilun