
6 Common Robots.txt Issues & How To Fix Them


Robots.txt is a useful and relatively powerful tool to instruct search engine crawlers on how you want them to crawl your website.

It is not all-powerful (in Google’s own words, “it is not a mechanism for keeping a web page out of Google”) but it can help to prevent your site or server from being overloaded by crawler requests.

If you have this crawl block in place on your site, you need to be certain it’s being used properly.

This is particularly important if you use dynamic URLs or other methods that generate a theoretically infinite number of pages.

In this guide, we will look at some of the most common issues with the robots.txt file, the impact they can have on your website and your search presence, and how to fix these issues if you think they have occurred.


But first, let’s take a quick look at robots.txt and its alternatives.

What Is Robots.txt?

Robots.txt uses a plain text file format and is placed in the root directory of your website.

It must be in the topmost directory of your site; if you place it in a subdirectory, search engines will simply ignore it.

Despite its great power, robots.txt is often a relatively simple document, and a basic robots.txt file can be created in a matter of seconds using an editor like Notepad.
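For illustration, here is a minimal sketch of such a file (the /admin/ path is a placeholder), which lets every crawler access the whole site except one directory:

User-agent: *
Disallow: /admin/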

There are other ways to achieve some of the same goals that robots.txt is usually used for.

Individual pages can include a robots meta tag within the page code itself.


You can also use the X-Robots-Tag HTTP header to influence how (and whether) content is shown in search results.
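As a rough illustration, a server response for content you want kept out of search results might carry a header like this (how you configure it depends on your server):

X-Robots-Tag: noindex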

What Can Robots.txt Do?

Robots.txt can achieve a variety of results across a range of different content types:

Web pages can be blocked from being crawled.

They may still appear in search results, but will not have a text description. Non-HTML content on the page will not be crawled either.

Media files can be blocked from appearing in Google search results.

This includes images, video, and audio files.


If the file is public, it will still ‘exist’ online and can be viewed and linked to, but it will not show in Google searches.

Resource files like unimportant external scripts can be blocked.

But this means that if Google crawls a page that requires that resource to load, Googlebot will ‘see’ a version of the page as if the resource did not exist, which may affect indexing.
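For illustration, directives covering all three of these cases might look like the following sketch (every path here is a placeholder):

User-agent: *
# Block a web page from being crawled
Disallow: /private-page.html
# Block media files from appearing in Google results (here, an image directory)
Disallow: /images/
# Block an unimportant resource file
Disallow: /scripts/tracking.js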

You cannot use robots.txt to completely block a web page from appearing in Google’s search results.

To achieve that, you must use an alternative method such as adding a noindex meta tag to the head of the page.

How Dangerous Are Robots.txt Mistakes?

A mistake in robots.txt can have unintended consequences, but it’s often not the end of the world.


The good news is that by fixing your robots.txt file, you can recover from any errors quickly and (usually) in full.

Google’s guidance to web developers says this on the subject of robots.txt mistakes:

“Web crawlers are generally very flexible and typically will not be swayed by minor mistakes in the robots.txt file. In general, the worst that can happen is that incorrect [or] unsupported directives will be ignored.

Bear in mind though that Google can’t read minds when interpreting a robots.txt file; we have to interpret the robots.txt file we fetched. That said, if you are aware of problems in your robots.txt file, they’re usually easy to fix.”

6 Common Robots.txt Mistakes

  1. Robots.txt Not In The Root Directory.
  2. Poor Use Of Wildcards.
  3. Noindex In Robots.txt.
  4. Blocked Scripts And Stylesheets.
  5. No Sitemap URL.
  6. Access To Development Sites.

If your website is behaving strangely in the search results, your robots.txt file is a good place to look for any mistakes, syntax errors, and overreaching rules.

Let’s take a look at each of the above mistakes in more detail and see how to ensure you have a valid robots.txt file.

1. Robots.txt Not In The Root Directory

Search robots can only discover the file if it’s in your root folder.

That’s why there should be nothing but a forward slash between your domain name (the .com or equivalent) and the ‘robots.txt’ filename in the URL of your robots.txt file.
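In practice, that means (example.com is a placeholder):

# Correct: discoverable by search robots
https://www.example.com/robots.txt

# Wrong: ignored by search robots
https://www.example.com/media/robots.txt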


If there’s a subfolder in there, your robots.txt file is probably not visible to the search robots, and your website is probably behaving as if there was no robots.txt file at all.

To fix this issue, move your robots.txt file to your root directory.

It’s worth noting that this requires root access to your server.

Some content management systems will upload files to a ‘media’ subdirectory (or something similar) by default, so you might need to circumvent this to get your robots.txt file in the right place.

2. Poor Use Of Wildcards

Robots.txt supports two wildcard characters:

  • Asterisk (*), which matches any sequence of valid characters, like a Joker in a deck of cards.
  • Dollar sign ($), which denotes the end of a URL, letting you apply rules only to the final part of a URL, such as the filetype extension (see the examples below this list).
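For example, rules like these use each wildcard in turn (the query-string and PDF patterns are common but hypothetical use cases):

User-agent: *
# Block any URL containing a query string
Disallow: /*?
# Block all URLs ending in .pdf
Disallow: /*.pdf$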

It’s sensible to adopt a minimalist approach to using wildcards, as they have the potential to apply restrictions to a much broader portion of your website.

It’s also relatively easy to end up blocking robot access from your entire site with a poorly placed asterisk.
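For instance, a pattern meant to catch a handful of URLs can, with the asterisk one character out of place, match everything (the -beta pattern below is hypothetical):

# Intended: block only URLs ending in -beta
Disallow: /*-beta$
# A stray rule like this blocks the entire site
Disallow: /*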


To fix a wildcard issue, you’ll need to locate the incorrect wildcard and move or remove it so that your robots.txt file performs as intended.

3. Noindex In Robots.txt

This one is more common in websites that are more than a few years old.

Google stopped obeying noindex rules in robots.txt files on September 1, 2019.

If your robots.txt file was created before that date and still contains noindex instructions, you’re likely to see those pages indexed in Google’s search results.

The solution to this problem is to implement an alternative ‘noindex’ method.

One option is the robots meta tag, which you can add to the head of any web page you want to prevent Google from indexing.
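The tag itself is a single line inside the page’s head:

<meta name="robots" content="noindex">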


4. Blocked Scripts And Stylesheets

It might seem logical to block crawler access to external JavaScript files and cascading stylesheets (CSS).

However, remember that Googlebot needs access to CSS and JS files in order to “see” your HTML and PHP pages correctly.

If your pages are behaving oddly in Google’s results, or it looks like Google is not seeing them correctly, check whether you are blocking crawler access to required external files.

A simple solution to this is to remove the line from your robots.txt file that is blocking access.

Or, if you have some files you do need to block, insert an exception that restores access to the necessary CSS and JavaScript files.
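As a sketch (the /assets/ paths are placeholders), Google supports an Allow directive that can carve exceptions out of a broader block:

User-agent: Googlebot
Disallow: /assets/
Allow: /assets/css/
Allow: /assets/js/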

5. No Sitemap URL

This is more about SEO than anything else.


You can include the URL of your sitemap in your robots.txt file.

Because this is the first place Googlebot looks when it crawls your website, this gives the crawler a head start in learning the structure and main pages of your site.

While this is not strictly an error, as omitting a sitemap should not negatively affect the actual core functionality and appearance of your website in the search results, it’s still worth adding your sitemap URL to robots.txt if you want to give your SEO efforts a boost.
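The directive itself is a single line, conventionally placed at the top or bottom of the file (example.com is a placeholder):

Sitemap: https://www.example.com/sitemap.xml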

6. Access To Development Sites

Blocking crawlers from your live website is a no-no, but so is allowing them to crawl and index your pages that are still under development.

It’s best practice to add a disallow instruction to the robots.txt file of a website under construction so the general public doesn’t see it until it’s finished.

Equally, it’s crucial to remove the disallow instruction when you launch a completed website.


Forgetting to remove this line from robots.txt is one of the most common mistakes among web developers, and can stop your entire website from being crawled and indexed correctly.

If your development site seems to be receiving real-world traffic, or your recently launched website is not performing at all well in search, look for a universal user agent disallow rule in your robots.txt file:

User-agent: *
Disallow: /

If you see this when you shouldn’t (or don’t see it when you should), make the necessary changes to your robots.txt file and check that your website’s search appearance updates accordingly.

How To Recover From A Robots.txt Error

If a mistake in robots.txt is having unwanted effects on your website’s search appearance, the most important first step is to correct robots.txt and verify that the new rules have the desired effect.

Some SEO crawling tools can help with this so you don’t have to wait for the search engines to next crawl your site.

When you are confident that robots.txt is behaving as desired, you can try to get your site re-crawled as soon as possible.

Platforms like Google Search Console and Bing Webmaster Tools can help.


Submit an updated sitemap and request a re-crawl of any pages that have been inappropriately delisted.

Unfortunately, you are at the whim of Googlebot – there’s no guarantee as to how long it might take for any missing pages to reappear in the Google search index.

All you can do is take the correct action to minimize that time and keep checking until Googlebot processes the fixed robots.txt.

Final Thoughts

Where robots.txt errors are concerned, prevention is definitely better than cure.

On a large revenue-generating website, a stray wildcard that removes your entire website from Google can have an immediate impact on earnings.

Edits to robots.txt should be made carefully by experienced developers, double-checked, and – where appropriate – subject to a second opinion.

Advertisement

If possible, test in a sandbox editor before pushing live on your real-world server to ensure you avoid inadvertently creating availability issues.

Remember, when the worst happens, it’s important not to panic.

Diagnose the problem, make the necessary repairs to robots.txt, and resubmit your sitemap for a new crawl.

Your place in the search rankings will hopefully be restored within a matter of days.



Featured Image: M-SUR/Shutterstock







Google’s Search Engine Market Share Drops As Competitors’ Shares Grow


Image: Assorted search engine apps, including Google, You.com, and Bing, on an iPhone. Microsoft plans to use ChatGPT in Bing, and You.com has launched an AI chatbot.

According to data from GS Statcounter, Google’s search engine market share has fallen to 86.99%, the lowest point since the firm began tracking search engine share in 2009.

The drop represents a more than 4% decrease from the previous month, marking the largest single-month decline on record.

Screenshot from: https://gs.statcounter.com/search-engine-market-share/, May 2024.

U.S. Market Impact

The decline is most significant in Google’s key market, the United States, where its share of searches across all devices fell by nearly 10%, reaching 77.52%.

Screenshot from: https://gs.statcounter.com/search-engine-market-share/, May 2024.

Concurrently, competitors Microsoft Bing and Yahoo Search have seen gains. Bing reached a 13% market share in the U.S. and 5.8% globally, its highest since launching in 2009.

Yahoo Search’s worldwide share nearly tripled to 3.06%, a level not seen since July 2015.

Screenshot from: https://gs.statcounter.com/search-engine-market-share/, May 2024.

Search Quality Concerns

Many industry experts have recently expressed concerns about the declining quality of Google’s search results.

A portion of the SEO community believes that the search giant’s results have worsened following the latest update.


These concerns have begun to extend to average internet users, who are increasingly voicing complaints about the state of their search results.

Alternative Perspectives

Web analytics platform SimilarWeb provided additional context on X (formerly Twitter), stating that its data for the US for March 2024 suggests Google’s decline may not be as severe as initially reported.

SimilarWeb also highlighted Yahoo’s strong performance, categorizing it as a News and Media platform rather than a direct competitor to Google in the Search Engine category.

Why It Matters

The shifting search engine market trends can impact businesses, marketers, and regular users.

Google has been on top for a long time, shaping how we find things online and how users behave.

However, as its market share drops and other search engines gain popularity, publishers may need to rethink their online strategies and optimize for multiple search platforms besides Google.

Users are becoming vocal about Google’s declining search quality over time. As people start trying alternate search engines, the various platforms must prioritize keeping users satisfied if they want to maintain or grow their market position.

It will be interesting to see how they respond to this boost in market share.

What It Means for SEO Pros

As Google’s competitors gain ground, SEO strategies may need to adapt by accounting for how each search engine’s algorithms and ranking factors work.


This could involve diversifying SEO efforts across multiple platforms and staying up-to-date on best practices for each one.

The increased focus on high-quality search results emphasizes the need to create valuable, user-focused content that meets the needs of the target audience.

SEO pros must prioritize informative, engaging, trustworthy content that satisfies both search engine algorithms and user expectations.

Remain flexible, adaptable, and proactive to navigate these shifts. Keeping a pulse on industry trends, user behaviors, and competing search engine strategies will be key for successful SEO campaigns.


Featured Image: Tada Images/Shutterstock





How To Drive Pipeline With A Silo-Free Strategy


When it comes to B2B strategy, a holistic approach is the only approach. 

Revenue organizations usually operate with siloed teams, and often expect a one-size-fits-all solution (usually buying clicks with paid media). 

However, without cohesive brand, infrastructure, and pipeline generation efforts, they’re pretty much doomed to fail. 

It’s just like rowing crew, where each member of the team must synchronize their movements to propel the boat forward – successful B2B marketing requires an integrated strategy. 

So if you’re ready to ditch your disjointed marketing efforts and try a holistic approach, we’ve got you covered.


Join us on May 15 for an insightful live session with Digital Reach Agency on how to craft a compelling brand and product-market fit (PMF).

We’ll walk through the critical infrastructure you need and the dependencies between the core digital marketing disciplines.

Key takeaways from this webinar:

  • Thinking Beyond Traditional Silos: Learn why traditional marketing silos are no longer viable and how they spell doom for modern revenue organizations.
  • How To Identify and Fix Silos: Discover actionable strategies for pinpointing and sealing the gaps in your marketing silos. 
  • The Power of Integration: Uncover the secrets to successfully integrating brand strategy, digital infrastructure, and pipeline generation efforts.

Ben Childs, President and Founder of Digital Reach Agency, and Jordan Gibson, Head of Growth at Digital Reach Agency, will show you how to seamlessly integrate various elements of your marketing strategy for optimal results.

Don’t make the common mistake of using traditional marketing silos – sign up now and learn what it takes to transform your B2B go-to-market.

You’ll also get the opportunity to ask Ben and Jordan your most pressing questions, following the presentation.

And if you can’t make it to the live event, register anyway and we’ll send you a recording shortly after the webinar. 



Why Big Companies Make Bad Content


It’s like death and taxes: inevitable. The bigger a company gets, the worse its content marketing becomes.

HubSpot teaching you how to type the shrug emoji or buy bitcoin stock. Salesforce sharing inspiring business quotes. GoDaddy helping you use Bing AI, or Zendesk sharing catchy sales slogans.

Judged by content marketing best practice, these articles are bad.

They won’t resonate with decision-makers. Nobody will buy a HubSpot license after Googling “how to buy bitcoin stock.” It’s the very definition of vanity traffic: tons of visits with no obvious impact on the business.

So why does this happen?

I did a double-take the first time I discovered this article on the HubSpot blog.

There’s an obvious (but flawed) answer to this question: big companies are inefficient.

As companies grow, they become more complicated, and writing good, relevant content becomes harder. I’ve experienced this firsthand:

  • extra rounds of legal review and stakeholder approval creeping into processes.
  • content watered down to serve an ever-more generic “brand voice”.
  • growing misalignment between search and content teams.
  • a lack of content leadership within the company as early employees leave.
As companies grow, content workflows can get kinda… complicated.

Similarly, funded companies have to grow, even when they’re already huge. Content has to feed the machine, continually increasing traffic… even if that traffic never contributes to the bottom line.

There’s an element of truth here, but I’ve come to think that both these arguments are naive, and certainly not the whole story.

It is wrong to assume that the same people that grew the company suddenly forgot everything they once knew about content, and wrong to assume that companies willfully target useless keywords just to game their OKRs.

Instead, let’s assume that this strategy is deliberate, and not oversight. I think bad content—and the vanity traffic it generates—is actually good for business.


There are benefits to driving tons of traffic, even if that traffic never directly converts.


Programmatic SEO is a good example. Why does Dialpad create landing pages for local phone numbers?


Why does Wise target exchange rate keywords?


Why do we have a list of the most popular websites?


As one Twitter user points out, these articles will never convert…

…but they don’t need to.

Every published URL and targeted keyword is a new doorway from the backwaters of the internet into your website. It’s a chance to acquire backlinks that wouldn’t otherwise exist, and an opportunity to get your brand in front of thousands of new, otherwise unfamiliar people.

These benefits might not directly translate into revenue, but over time, in aggregate, they can have a huge indirect impact on revenue. They can:

  • Strengthen domain authority and the search performance of every other page on the website.
  • Boost brand awareness, and encourage serendipitous interactions that land your brand in front of the right person at the right time.
  • Deny your competitors traffic and dilute their share of voice.

These small benefits become more worthwhile when multiplied across many hundreds or thousands of pages. If you can minimize the cost of the content, there is relatively little downside.

What about topical authority?

“But what about topical authority?!” I hear you cry. “If you stray too far from your area of expertise, won’t rankings suffer for it?”

I reply simply with this screenshot of Forbes’ “health” subfolder, generating almost 4 million estimated monthly organic pageviews.


And big companies can minimize cost. For large, established brands, the marginal cost of content creation is relatively low.

Many companies scale their output through networks of freelance writers, avoiding the cost of fully loaded employees. They have established, efficient processes for research, briefing, editorial review, publication, and maintenance. The cost of an additional “unit” of content—or ten, or a hundred—is not that great, especially relative to other marketing channels.

There is also relatively little opportunity cost to consider: the fact that energy spent on “vanity” traffic could be better spent elsewhere, on more business-relevant topics.


In reality, many of the companies engaging in this strategy have already plucked the low-hanging fruit and written almost every product-relevant topic. There are a finite number of high traffic, high relevance topics; blog consistently for a decade and you too will reach these limits.

On top of that, the HubSpots and Salesforces of the world have very established, very efficient sales processes. Content gating, lead capture and scoring, and retargeting allow them to put very small conversion rates to relatively good use.


Even HubSpot’s article on Bitcoin stock has its own relevant call-to-action—and for HubSpot, building a database of aspiring investors is more valuable than it sounds, because…

The bigger a company grows, the bigger its audience needs to be to continue sustaining that growth rate.

Companies generally expand their total addressable market (TAM) as they grow, like HubSpot broadening from marketing to sales and customer success, launching new product lines for new—much bigger—audiences. This means the target audience for their content marketing grows alongside.

Peep Laja has made the same point.


But for the biggest companies, this principle is taken to an extreme. When a company gears up to IPO, its target audience expands to… pretty much everyone.

This was something Janessa Lantz (ex-HubSpot and dbt Labs) helped me understand: the target audience for a post-IPO company is not just end users, but institutional investors, market analysts, journalists, even regular Jane investors.

These are people who can influence the company’s worth in ways beyond simply buying a subscription: they can invest or encourage others to invest and dramatically influence the share price. These people are influenced by billboards, OOH advertising and, you guessed it, seemingly “bad” content showing up whenever they Google something.


You can think of this as a second, additional marketing funnel for post-IPO companies:

Illustration: When companies IPO, the traditional marketing funnel is accompanied by a second funnel. Website visitors contribute value through stock appreciation, not just revenue.

These visitors might not purchase a software subscription when they see your article in the SERP, but they will notice your brand, and maybe listen more attentively the next time your stock ticker appears on the news.

They won’t become power users, but they might download your eBook and add an extra unit to the email subscribers reported in your S1.

They might not contribute revenue now, but they will in the future: in the form of stock appreciation, or becoming the target audience for a future product line.

Vanity traffic does create value, but in a form most content marketers are not used to measuring.

If any of these benefits apply, then it makes sense to acquire them for your company—but also to deny them to your competitors.


SEO is an arms race: there are a finite number of keywords and topics, and leaving a rival to claim hundreds, even thousands of SERPs uncontested could very quickly create a headache for your company.

SEO can quickly create a moat of backlinks and brand awareness that is virtually impossible to challenge; left unchecked, the gap between your company and a rival can widen at an accelerating pace.

Pumping out “bad” content and chasing vanity traffic is a chance to deny your rivals unchallenged share of voice, and make sure your brand always has a seat at the table.

Final thoughts

These types of articles are miscategorized—instead of thinking of them as bad content, it’s better to think of them as cheap digital billboards with surprisingly great attribution.

Big companies chasing “vanity traffic” isn’t an accident or oversight—there are good reasons to invest energy into content that will never convert. There is benefit, just not in the format most content marketers are used to.

This is not an argument to suggest that every company should invest in hyper-broad, high-traffic keywords. But if you’ve been blogging for a decade, or you’re gearing up for an IPO, then “bad content” and the vanity traffic it creates might not be so bad.
