How to Block Scrapers, Hackers and Spammers with Wordfence

Wordfence is a popular WordPress security plugin. Among its features are a scanner that monitors for hacked files and a firewall with regularly updated rules that proactively blocks malicious bots.

There’s also a useful feature tucked away in the tool: user-configurable firewall rules that can supercharge your ability to block hackers, scrapers and spammers.

Scrapers are especially troublesome because they copy your content and publish it elsewhere.

Using a tool like Wordfence can help reduce the amount of content that scrapers can plagiarize.

There are many WordPress security plugins and SaaS solutions to choose from that are highly recommended, including Sucuri Security and Cloudflare. Wordfence is one of many security solutions available and it’s up to you to figure out which feels more comfortable within your workflow.

Wordfence and other solutions function fine as a set-it-and-forget-it solution.

However, in my experience the user-configurable firewall in Wordfence gives you an opportunity to dial up the bot hammering power and really stick it to the hackers and scrapers.

But before you dial up the firewall it’s important to know how far these firewall rules can be taken and we’ll take a look at that, too.

Wordfence WordPress Security

Wordfence is trusted by over 4 million users for protecting their WordPress sites.

The default Firewall behavior is to block bots that grab too many pages too fast or bots and humans that display activities that signal an intent to hack the site.

The firewall will block the IP address of the rogue bot for a set period of time, after which Wordfence drops the block.

The default settings on the firewall work great.

But sometimes bots still get through and are able to scrape a site or probe it for vulnerabilities by scraping the site slowly.

A common approach by hackers is to set a bot to hit the site quickly and when it gets blocked it will rotate to other IP addresses and user agents, which causes a firewall to start the detection process all over again.
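
To see why rotating IP addresses defeats per-address detection, here is a minimal Python sketch. It assumes a deliberately simplified limiter that counts requests per IP address with no time window; the function name, the limit and the addresses are made up for illustration and do not reflect how Wordfence is implemented.

```python
from collections import defaultdict

def refused_requests(request_ips, limit):
    """Count requests refused by a limiter that tracks each IP address separately."""
    seen = defaultdict(int)
    refused = 0
    for ip in request_ips:
        seen[ip] += 1
        # Once an address exceeds the limit, every further request from it is refused.
        if seen[ip] > limit:
            refused += 1
    return refused

# Hypothetical traffic: 30 requests from one address versus 30 requests
# spread over ten addresses (documentation-range IPs, three requests each).
single_ip = ["203.0.113.5"] * 30
rotating = [f"203.0.113.{n % 10}" for n in range(30)]

print(refused_requests(single_ip, limit=5))
print(refused_requests(rotating, limit=5))
```

The single address is cut off after five requests, while the rotating bot never trips the limit on any one address. That is the gap that blocking by hostname or user agent is meant to close.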

But these bots aren’t always programmed very well which makes it easy to block them more efficiently than with the default Wordfence settings.

Background Information About Wordfence Firewall Rules

It’s possible to accomplish efficient bot blocking with server level tools, multiple plugins and even by the use of an .htaccess file.

But editing an .htaccess file can be tricky because there are strict rules to follow and a mistake in the .htaccess file can cause the entire site to fail.

Using firewall rules is simply an easier way to block bots.

What Can You Block With Wordfence?

Wordfence allows you to create rules to block according to each of the following reasons:

  • IP Address Range
  • Hostname
  • Browser User Agent
  • Referrer

IP Address Range

IP address means the IP address of the server or ISP that the bot or human is coming from.

Hostname

Hostname means the name of the host. The host isn’t always declared. Sometimes the bot or human visitor displays just an IP address.

Browser User Agent

Every site visitor generally tells the server what browser it is using. Browser User Agent means the browser that the visitor says it’s using. A bot can claim to be virtually any browser, and bots sometimes do so in order to evade detection.

Referrer

This is a page that a bot or human supposedly clicked a link from.

Wordfence Custom Pattern Blocking

The way to block bad bots using any of the above four variables is by adding a custom rule in the Custom Pattern Blocking tool.

Here’s how to reach it.

Step 1

Click the link to the Firewall from the left side admin menu in WordPress

Step 2

Choose the tab labeled Blocking

Step 3

Choose the “Custom Pattern” tab and create a firewall rule in the appropriate field. One of the fields is labeled “Block Reason.” Use that field to add a descriptive phrase such as Hostname or User Agent. That lets you sort the rules you create by block type when you review them later.

Step 4

Wordfence step 4

Step 5

Make your rule by clicking the “Block Visitors Matching This Pattern” button and you’re done.

Wordfence rules can use the asterisk (*) as a wild card.
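
To make the wildcard behavior concrete, here is a minimal Python sketch of how a pattern containing asterisks could be compared against a value. It assumes the asterisk stands for any run of characters (including none) and that matching ignores case. The helper name wildcard_match and the example hostnames are hypothetical, and this is an approximation for reasoning about rules, not Wordfence's own code.

```python
import re

def wildcard_match(pattern: str, value: str) -> bool:
    """Return True if value fits a pattern in which * matches any run of characters."""
    # Escape each literal piece so dots and slashes are matched literally,
    # then join the pieces with ".*" in place of each asterisk.
    pieces = [re.escape(piece) for piece in pattern.split("*")]
    regex = ".*".join(pieces)
    # Assumption: matching is case-insensitive and must cover the whole value.
    return re.fullmatch(regex, value, flags=re.IGNORECASE) is not None

print(wildcard_match("*.example", "crawler.host.example"))  # suffix present
print(wildcard_match("*.example", "host.example.org"))      # suffix not at the end
```

A leading asterisk turns the rule into a suffix check, and an asterisk on both sides turns it into a "contains" check.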

Should You Block IP Addresses with Wordfence?

Wordfence makes it easy for a publisher to set up firewall rules that efficiently block bots.

That’s a blessing but it can also be a curse. For example, permanently blocking thousands of IP addresses using the Wordfence firewall is not efficient and probably not a proper use of Wordfence.

Temporarily blocking IP addresses is fine. Permanently blocking IP addresses is probably not fine because, as I understand it, going by memory, this can bloat or slow down your WordPress installation.

In general, permanently blocking thousands or even millions of IP addresses is best accomplished with an .htaccess file.

Hostname Blocking with Wordfence

Blocking a hostname with Wordfence can be a way to block hackers, spammers and scrapers. By clicking Wordfence > Tools you can view the Wordfence Live Traffic log.

That shows you bot and human visitors, including bots that were blocked automatically by Wordfence.

Not all site visitors display their hostname. However in some cases they do display their hostname and that makes it easy to block an entire web host.

For example, one of my sites, for whatever reason, attracts DDoS levels of bot traffic from a single host. None of my other sites attracts that much attention from this host, just this one site.

Between March 2020 and December 2021 that one site received over 250,000 attacks and every single one of them was blocked by Wordfence.

Clearly, blocking bots by hostname can be useful if you want to block a cloud host that sends nothing but hackers and scrapers.

However some hosts, like Amazon Web Services (AWS) send both bad bots and good bots. Blocking AWS servers can also inadvertently block good bots.

So it’s important to monitor your traffic and be absolutely certain that blocking a hostname will not backfire.

On the other hand, if you have no use for traffic from Russia or China, then it’s easy to block hackers, scrapers and spammers from those two countries by creating a firewall rule using the hostname field.

All you have to do is create rules that block every hostname ending in .ru or .cn. Note that this only catches visitors whose hostname is displayed and ends in one of those two suffixes.

This is what you enter into the Hostname field:

*.ru
*.cn

This is not meant to encourage anyone to use Wordfence to block Russian and Chinese bots via the hostname. It’s just an example to show how it’s done.
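
As a rough illustration of what those two rules would and would not catch, here is a small Python sketch. It assumes that a rule beginning with an asterisk amounts to a case-insensitive suffix check. The function name and the hostnames are invented for the example.

```python
# The two example rules, expressed as the suffixes they require.
BLOCKED_SUFFIXES = (".ru", ".cn")

def is_blocked_hostname(hostname: str) -> bool:
    """Return True if the hostname ends in one of the blocked suffixes."""
    # An empty hostname means the visitor showed only an IP address,
    # so a hostname rule has nothing to match against.
    if not hostname:
        return False
    # Ignore case and a trailing dot, then compare the end of the name.
    return hostname.lower().rstrip(".").endswith(BLOCKED_SUFFIXES)

# Hypothetical hostnames as they might appear in a traffic log.
for host in ["crawler.example.ru", "node7.example.cn", "ec2-host.example.com", "example.run", ""]:
    print(repr(host), is_blocked_hostname(host))
```

Visitors that show no hostname pass straight through a hostname rule, which is one reason to check the traffic log before relying on this kind of block.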

Block Hackers and Scrapers By User Agent

Many rogue bots use old and out of date browser user agents.

After Russia invaded Ukraine I noticed an increase in hacking bots using the Chrome 90 user agent (UA) from the same group of web hosts. Normally bot traffic is different across the different websites. So this stood out when they all looked the same across all of my sites.

Whenever Wordfence automatically blocked these bots for hitting my site too fast the bots would switch IP address and begin hitting the sites over and over again.

So I decided to block these bots by their Browser User Agent (often referred to simply as the UA).

First I checked the StatCounter website to determine how many users around the world are using Chrome 90. According to the StatCounter statistics, Chrome 90 had a 0.09% browser market share in the USA as of January 2022.

At the time of this writing the Chrome browser is at version 100. Considering that Chrome automatically updates browser versions for the vast majority of users, it’s not surprising that usage of Chrome 90 is virtually nothing. So it’s very unlikely that blocking all visitors using a Chrome 90 browser user agent will block an actual, legitimate person visiting your site.

So I determined that it’s safe to block anything that shows up to my site with the Chrome 90 user agent.

However, there are online tools, like GTMetrix and a security server header checker, that use the Chrome 90 user agent.

So if I blocked all versions of Chrome 90 (by using this rule: *Chrome/90.*), I would also block those two online tools.

Another way to do it is to look at the specific Chrome 90 variants used by the hackers and the online tools.

GTMetrix and the other tool use this Chrome UA:

Chrome/90.0.4430.212

Hackers and scrapers use these Chrome UAs:

Chrome/90.0.4400.8
Chrome/90.0.4427.0
Chrome/90.0.4430.72
Chrome/90.0.4430.85
Chrome/90.0.4430.86
Chrome/90.0.4430.93

So, if you want to allow the online tools to still scan your site but also block the bad bots, this is an example of how to do it:

*Chrome/90.0.4400.8*
*Chrome/90.0.4427.0*
*Chrome/90.0.4430.72*
*Chrome/90.0.4430.85*
*Chrome/90.0.4430.86*
*Chrome/90.0.4430.93*
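
The following Python sketch checks that reasoning: it tests the six rules against a blocked variant and against the variant used by the online tools. It assumes the asterisk matches any run of characters and that matching is case-insensitive. The full user agent string around the Chrome version is a typical-looking example, not one taken from a real log.

```python
import re

# The six rules listed above.
RULES = [
    "*Chrome/90.0.4400.8*",
    "*Chrome/90.0.4427.0*",
    "*Chrome/90.0.4430.72*",
    "*Chrome/90.0.4430.85*",
    "*Chrome/90.0.4430.86*",
    "*Chrome/90.0.4430.93*",
]

def rule_to_regex(rule: str) -> re.Pattern:
    # Escape the literal pieces and let each asterisk match any run of characters.
    pieces = [re.escape(piece) for piece in rule.split("*")]
    return re.compile(".*".join(pieces), re.IGNORECASE)

COMPILED_RULES = [rule_to_regex(rule) for rule in RULES]

def is_blocked(user_agent: str) -> bool:
    """Return True if any rule matches the whole user agent string."""
    return any(regex.fullmatch(user_agent) for regex in COMPILED_RULES)

# Illustrative user agent template; only the Chrome version changes.
UA_TEMPLATE = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/{} Safari/537.36"
)

print(is_blocked(UA_TEMPLATE.format("90.0.4430.93")))   # variant used by the bad bots
print(is_blocked(UA_TEMPLATE.format("90.0.4430.212")))  # variant used by the online tools
```

Listing exact versions costs more rules than a single *Chrome/90.* pattern, but it leaves the 90.0.4430.212 variant untouched.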

To block Chrome/90.0.4430.93, for example, enter *Chrome/90.0.4430.93* as the Browser User Agent pattern on the Custom Pattern tab.

Caveat About Blocking User Agents

Before blocking Chrome 90 I kept checking the Wordfence traffic log (accessible at Wordfence > Tools) to check whether any legit bots, like GTMetrix, were using that user agent.

For example, you might not want to block Chrome 96 because some of Google’s tools use Chrome 96 as a user agent.

Always research whether legitimate bots are using a particular user agent or hostname.

An easy way to research that is by using the Wordfence Traffic Log.

Wordfence Traffic Log

The Wordfence traffic log shows you at a glance all user agents accessing your site in near real-time. The traffic log shows information such as user agent, indicates whether the visitor is a bot or a human, provides the IP address, hostname, the page being accessed and other information that helps determine if a visitor is legit or not.

The way to access the traffic log is by clicking Wordfence > Tools.

Blocking old browser versions is an easy way to block a lot of bad bots. Chrome versions from the 80, 70, 60, 50, 40 and 30 series are particularly numerous on some sites.

Here’s an example of how to block old Chrome UAs that are used by bad bots:

*Chrome/8*.*
*Chrome/7*.*
*Chrome/6*.*
*Chrome/5.0*
*Chrome/95.*
*Chrome/5*.*
*Chrome/3*.*
*Chrome/4*.*

Again, the above is not an encouragement to block the above bots.

The reason I would use *Chrome/6*.* is that a single rule blocks the entire Chrome 60 series of user agents (Chrome 60, 61, 63, etc.) without my having to write a rule for each of the ten versions.

Do not block the ten and up series like this *Chrome/1*.* because that will also block the most current version of Chrome, Chrome 100.
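
The difference between those two rules can be checked with a short Python sketch. It assumes the asterisk matches any run of characters, and it builds illustrative user agent strings for Chrome major versions 55 through 104 (the rest of the string is a typical-looking placeholder).

```python
import re

def rule_matches(rule: str, user_agent: str) -> bool:
    # Escape the literal pieces and let each asterisk match any run of characters.
    pieces = [re.escape(piece) for piece in rule.split("*")]
    return re.fullmatch(".*".join(pieces), user_agent, flags=re.IGNORECASE) is not None

def chrome_ua(major: int) -> str:
    # Illustrative user agent string; only the major version varies.
    return (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        f"(KHTML, like Gecko) Chrome/{major}.0.0.0 Safari/537.36"
    )

def majors_blocked(rule: str, majors=range(55, 105)) -> list:
    """List the Chrome major versions in the given range that the rule would block."""
    return [major for major in majors if rule_matches(rule, chrome_ua(major))]

print(majors_blocked("*Chrome/6*.*"))  # only the 60 series
print(majors_blocked("*Chrome/1*.*"))  # includes Chrome 100 and later
```

The first rule stays inside the 60 series, while the second catches every version number that starts with 1, including current releases.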

The above is an example of how to block bad bots using the described Chrome user agents.

Bad bots also use old and retired Firefox browser user agents and some even display python-requests/ as a user agent.

Be Careful When Creating Firewall Rules

Always do your research first to determine what bad bots are using on your own sites and make sure that no legitimate bots or site visitors are using those old and retired browser user agents.

The way to do your research is by inspecting your traffic log files or the Wordfence traffic logs to determine which user agents (or hostnames) are from malicious traffic that you don’t want.
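
One way to do that research on raw log data is to tally which Chrome versions appear. The Python sketch below counts Chrome major versions in a list of user agent strings. The sample strings are invented, and extracting the user agent column from your own server or Wordfence log is left out because log formats differ.

```python
import re
from collections import Counter

# Captures the major version number that follows "Chrome/".
CHROME_VERSION = re.compile(r"Chrome/(\d+)\.")

def chrome_major_counts(user_agents) -> Counter:
    """Count how often each Chrome major version appears in the user agents."""
    counts = Counter()
    for user_agent in user_agents:
        match = CHROME_VERSION.search(user_agent)
        if match:
            counts[int(match.group(1))] += 1
    return counts

def sample_ua(version: str) -> str:
    # Illustrative user agent string for the sample data.
    return (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        f"(KHTML, like Gecko) Chrome/{version} Safari/537.36"
    )

# Hypothetical sample, standing in for the user agent column of a traffic log.
SAMPLE = [
    sample_ua("90.0.4430.93"),
    sample_ua("90.0.4430.93"),
    sample_ua("90.0.4430.72"),
    sample_ua("100.0.4896.60"),
    "python-requests/2.27.1",
]

for major, hits in chrome_major_counts(SAMPLE).most_common():
    print(f"Chrome {major}: {hits} requests")
```

An old version that shows up far more often than its real-world market share would suggest is a candidate for a closer look, not an automatic block.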




Everything You Need To Know

Of all the many, many functions available in Google Ads, I have a few that are my favorites. And sitelink assets – previously known as sitelink extensions – are at the top of my list.

Why? Because they’re so versatile. You can do almost anything with them if you think through your strategy carefully.

For example, you can use the mighty sitelink in your advertising to:

  • Promote low search volume themes.
  • Push lagging products out the door.
  • Maximize hot sellers.
  • Highlight certain product categories.
  • Answer common questions.
  • Handle PR problems.

And that’s just a start! Sitelink assets can almost do it all.

Best Practices For Using Sitelink Assets

If you truly want to get the most out of your sitelinks, you need to think about your intention.

To help you with that, I’m going to lay out a few sitelink guidelines.

1. Get clear on your objectives. Before you start, you need to think about your goals. What are you trying to achieve with these assets? Are you advertising products or services? Will the asset work well with both branded and non-branded keywords? Your answers to these questions will help determine if your sitelinks are versatile and useful to the searcher.

2. Use sitelinks as part of your larger strategy. Don’t think of your sitelinks in isolation. You should also consider the accompanying ad, landing page, and other assets. Make sure they all work together in service to your overarching strategy.

3. Use a mix of sitelinks. Sitelinks can serve multiple purposes, so make sure you’re using a variety. For example, you don’t want to use every sitelink on an ad to promote on-sale products. Instead, use a mix. One could promote an on-sale product, one could generate leads, one could highlight a new product category, and one could direct prospective clients to useful information.

4. Create landing pages for your sitelinks. Ideally, you want to send users to landing pages that tightly correlate with your sitelink instead of just a regular page on your website.

5. Track sitelink performance and adjust. It’s not enough to set up sitelinks. You should also track them to see which links are getting traction and which ones are not. This doesn’t mean that all sitelinks should perform equally (more on this below), but it does mean they should perform well given their type and objectives.

Why It’s Better To Use A Mix Of Sitelink Assets

Let’s dive deeper into this idea of using a mix of sitelinks by looking at an example.

In a new client account, we created four different types of sitelinks:

  • Two sitelinks are product-focused (as requested by the client).
  • One sitelink connects users with an engineer to learn more about the product (“Speak to an Engineer”). It has more of a sales focus.
  • One sitelink allows users to learn more about the products without speaking to an engineer (“What is?”).

The “What is?” sitelink is outperforming the “Speak to an Engineer” sitelink when we measure by CTR. While we need more data before making any changes, I predict we’ll eventually swap out the sales-y “Speak to an Engineer” sitelink for something else.

The fact that the educational link (“What is?”) is performing better than the sales-y link (“Speak to an Engineer”) isn’t too surprising in this case. The product is a new, cutting-edge robot that not many people are aware of, yet. They want more info before talking to someone.

Screenshot by author, January 2023

By using a mix of sitelinks, and assessing the performance of each, we gained a lot of valuable information that is helping to guide our strategy for this account. So going with a mix of sitelinks is always a good idea. You never know what you’ll discover!

Sitelink Assets Examples

Now, let’s look at some specific examples of sitelink assets in Google Ads.

Example 1: Chromatography

Sitelinks extension - Chromatography example. Screenshot from Google, January 2023

Application Search: This ad is for a highly technical product that can be used in a wide variety of applications. (Chromatography is a laboratory technique for separating mixtures.) So putting “application search” in a sitelink here might make sense. It helps prospective clients find what they’re looking for.

Sign up and Save Big: A good sitelink for lead generation and potential revenue.

Technical Support: I’m not a big fan of putting technical support in sitelinks. Tech support seems more targeted to current users rather than prospective users. But who knows, maybe they really do want to help current users get tech support via their advertising.

Guides and Posters: Again, this sitelink is a bit unusual, but it might be appropriate for this product. Perhaps people are downloading branded posters and posting them in their workplaces. If so, it’s a great way to build brand awareness.

Example 2: Neuroscience Courses

Sitelink extensions - Neuroscience courses example. Screenshot from Google, January 2023

I love everything about these sitelinks! The advertiser is using them to reach people in all phases of the buyer journey.

For people not ready to commit:

  • Study Neuroscience: This sitelink is broad and informational. It’s helpful to people who have just started to explore their options for studying neuroscience.
  • Get Course Brochure: This sitelink is also great for people in the research phase. And while we mostly live in an online world, some people still prefer to consume hard-copy books, brochures, etc. With this sitelink, the school is covering its bases.

For people getting close to committing:

  • Online Short Course: This is the course the school offers. It’s a great sitelink for those almost ready to sign up.

For people ready to sign up:

  • Register Online Now: This is the strongest call to action for those ready to commit. It takes people directly to the signup page.

Example 3: Neuroscience Degrees

Let’s look at another example from the world of neuroscience education: this time for a neuroscience degree program.

Sitelink extensions - neuroscience degree example. Screenshot from Google, January 2023

In contrast to the previous two examples, the sitelinks in this ad aren’t as strong.

Academics Overview: This sitelink seems more appropriate for a broad term search, such as a search on the school’s name. If the searcher is looking for a specific degree program (which seems like the intention based on the term and the ad), the sitelinks should be something specific to that particular degree program.

Scholarships: Just as with the above sitelink, “Scholarships” doesn’t seem very helpful either. The topic of scholarships is important—but probably doesn’t need to be addressed until the person determines that this school is a good fit.

Example 4: Code Security

Next, let’s look at two Google search ads for code security products.

Sitelink extensions - code security example. Screenshot from Google, January 2023

The sitelinks in these two ads look like typical assets you’d find for SaaS, cloud-based, or tech companies. They click through to a lot of helpful information, such as product plans and success stories.

I particularly like the Most Common Risks sitelink in the second ad. It leads to a helpful article that would be great for engaging top-of-funnel leads.

On the flip side, I’m not a big fan of the Blog sitelink in the first ad. “Blog” simply isn’t very descriptive or helpful.

Still, there are no right or wrong sitelinks here. And it would be interesting to test my theory that blog content is not a top-performing asset!

Sitelink Assets Are More Than An Afterthought

I hope I’ve convinced you of the usefulness and versatility of sitelinks when created with specific objectives that align with your broader strategy.

So don’t create your sitelink assets as an afterthought.

Because if you give them the careful consideration they deserve, they’ll serve you well.

Note: Google sitelink assets were previously known as sitelink extensions and renamed in September 2022.

Featured Image: Thaspol Sangsee/Shutterstock



AI Content In Search Results

Google has released a statement regarding its approach to AI-generated content in search results.

The company has a long-standing policy of rewarding high-quality content, regardless of whether humans or machines produce it.

Above all, Google’s ranking systems aim to identify content that demonstrates expertise, experience, authoritativeness, and trustworthiness (E-E-A-T).

Google advises creators looking to succeed in search results to produce original, high-quality, people-first content that demonstrates E-E-A-T.

The company has updated its “Creating helpful, reliable, people-first content” help page with guidance on evaluating content in terms of “Who, How, and Why.”

Here’s how AI-generated content fits into Google’s approach to ranking high-quality content in search results.

Quality Over Production Method

Focusing on the quality of content rather than the production method has been a cornerstone of Google’s approach to ranking search results for many years.

A decade ago, there were concerns about the rise in mass-produced human-generated content.

Rather than banning all human-generated content, Google improved its systems to reward quality content.

Google’s focus on rewarding quality content, regardless of production method, continues to this day through its ranking systems and helpful content system introduced last year.

Automation & AI-Generated Content

Using automation, including AI, to generate content with the primary purpose of manipulating ranking in search results violates Google’s spam policies.

Google’s spam-fighting efforts, including its SpamBrain system, will continue to combat such practices.

However, Google realizes not all use of automation and AI-generated content is spam.

For example, publishers automate helpful content such as sports scores, weather forecasts, and transcripts.

Google says it will continue to take a responsible approach toward AI-generated content while maintaining a high bar for information quality and helpfulness in search results.

Google’s Advice For Publishers

For creators considering AI-generated content, here’s what Google advises.

Google’s concept of E-E-A-T is outlined in the “Creating helpful, reliable, people-first content” help page, which has been updated with additional guidance.

The updated help page asks publishers to think about “Who, How, and Why” concerning how content is produced.

“Who” refers to the person who created the content, and it’s important to make this clear by providing a byline or background information about the author.

“How” relates to the method used to create the content, and it’s helpful to readers to know if automation or AI was involved. If AI was involved in the content production process, Google wants you to be transparent and explain why it was used.

“Why” refers to the purpose of creating content, which should be to help people rather than to manipulate search rankings.

Evaluating your content in this way, regardless of whether AI-generated or not, will help you stay in line with what Google’s systems reward.


Featured Image: Alejandro Corral Mena/Shutterstock



Seven tips to optimize page speed in 2023

30-second summary:

  • Page load time has gradually come to have a greater impact on Google rankings
  • Google has introduced the three Core Web Vitals metrics as ranking factors to measure user experience
  • The following steps can help you get a better idea of the performance of your website through multiple tests

A fast website not only delivers a better experience but can also increase conversion rates and improve your search engine rankings. Google has introduced the three Core Web Vitals metrics to measure user experience and is using them as a ranking factor.

Let’s take a look at what you can do to test and optimize the performance of your website.

Start in Google Search Console

Want to know if optimizing Core Web Vitals is something you should be thinking about? Use the page experience report in Google Search Console to check if any of the pages on your website are loading too slowly.

Search Console shows data that Google collects from real users in Chrome, and this is also the data that’s used as a ranking signal. You can see exactly what page URLs need to be optimized.

Run a website speed test

Google’s real user data will tell you how fast your website is, but it won’t provide an analysis that explains why your website is slow.

Run a free website speed test to find out. Simply enter the URL of the page you want to test. You’ll get a detailed performance report for your website, including recommendations on how to optimize it.

Use priority hints to optimize the Largest Contentful Paint

Priority Hints are a new browser feature that came out in 2022. They allow website owners to indicate how important an image or other resource is on the page.

This is especially important when optimizing the Largest Contentful Paint, one of the three Core Web Vitals metrics. It measures how long it takes for the main page content to appear after opening the page.

By default, browsers assume that all images are low priority until the page starts rendering and the browser knows which images are visible to the user. That way bandwidth isn’t wasted on low-priority images near the bottom of the page or in the footer. But it also slows down important images at the top of the page.

Adding a fetchpriority="high" attribute to the img element that’s responsible for the Largest Contentful Paint ensures that it’s downloaded quickly.

Use native image lazy loading for optimization

Image lazy loading means only loading images when they become visible to the user. It’s a great way to help the browser focus on the most important content first.

However, image lazy loading can also cause images to take longer to load, especially when using a JavaScript lazy loading library. In that case, the browser first needs to load the JavaScript library before starting to load images. This long request chain means that it takes a while for the browser to load the image.

Today browsers support native lazy loading with the loading="lazy" attribute for images. That way you can get the benefits of lazy loading without incurring the cost of having to download a JavaScript library first.
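
To check which images on a page already carry these attributes, a small audit script can help. The Python sketch below uses the standard library HTML parser to list each img element together with its loading and fetchpriority values. The class name and the sample markup are made up for illustration, and the script only reads static HTML, so images inserted by JavaScript are not seen.

```python
from html.parser import HTMLParser

class ImageAttributeAudit(HTMLParser):
    """Collect the src, loading and fetchpriority attributes of every img element."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            self.images.append(
                (
                    attributes.get("src"),
                    attributes.get("loading"),
                    attributes.get("fetchpriority"),
                )
            )

# Hypothetical page markup: a hero image, a gallery image and a footer logo.
SAMPLE_HTML = """
<img src="hero.jpg" fetchpriority="high" alt="Hero">
<img src="gallery-1.jpg" loading="lazy" alt="Gallery">
<img src="footer-logo.png" alt="Logo">
"""

audit = ImageAttributeAudit()
audit.feed(SAMPLE_HTML)
for src, loading, priority in audit.images:
    print(src, "loading:", loading, "fetchpriority:", priority)
```

In this sample, the footer logo has neither attribute, which makes it a candidate for loading="lazy". The image responsible for the Largest Contentful Paint should generally not be lazy loaded.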

Remove and optimize render-blocking resources

Render-blocking resources are network requests that the browser needs to make before it can show any page content to the user. They include the HTML document, CSS stylesheets, as well as some JavaScript files.

Since these resources have such a big impact on page load time you should check each one to see if it’s truly necessary. The async keyword on the HTML script tag lets you load JavaScript code without blocking rendering.

If a resource has to block rendering check if you can optimize the request to load the resource more quickly, for example by improving compression or loading the file from your main web server instead of from a third party.
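
As a starting point for that check, the Python sketch below lists external scripts in a page that have no async attribute. It also skips scripts marked defer or type="module", which browsers likewise load without blocking; those two cases go beyond what this article covers and are included as an assumption about what you want to exclude. The class name and sample markup are invented for the example.

```python
from html.parser import HTMLParser

class BlockingScriptFinder(HTMLParser):
    """Collect external scripts that are loaded without async, defer or type=module."""

    def __init__(self):
        super().__init__()
        self.blocking = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        attributes = dict(attrs)
        src = attributes.get("src")
        if src is None:
            return  # inline script, no network request to review
        if "async" in attributes or "defer" in attributes:
            return  # fetched without blocking the parser
        if (attributes.get("type") or "").lower() == "module":
            return  # module scripts are deferred by default
        self.blocking.append(src)

# Hypothetical page head with three external scripts.
SAMPLE_HTML = """
<head>
  <script src="analytics.js"></script>
  <script src="widget.js" async></script>
  <script src="app.js" defer></script>
</head>
"""

finder = BlockingScriptFinder()
finder.feed(SAMPLE_HTML)
print(finder.blocking)
```

Each script it reports is worth reviewing by hand: some can take the async attribute, some can be removed, and some genuinely need to run before the page renders.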

Optimize with the new Interaction to Next Paint metric

Google has announced a new metric called Interaction to Next Paint. This metric measures how quickly your site responds to user input and is likely to become one of the Core Web Vitals in the future.

You can already see how your website is doing on this metric using tools like PageSpeed Insights.

Continuously monitor your site performance

One-off site speed tests can identify performance issues on your website, but they don’t make it easy to keep track of your test results and confirm that your optimizations are working.

DebugBear continuously monitors your website and alerts you when there’s a problem. The tool also makes it easy to show off the impact of your work to clients and share test results with your team.

Try DebugBear with a free 14-day trial.
