Does Google Have A Problem With Big Robots.txt Files?




Google addresses the subject of robots.txt files and whether it’s a good SEO practice to keep them within a reasonable size.

This topic is discussed by Google’s Search Advocate John Mueller during the Google Search Central SEO office-hours hangout recorded on January 14.

David Zieger, an SEO manager for a large news publisher in Germany, joins the livestream with concerns about a “huge” and “complex” robots.txt file.

How huge are we talking here?

Zieger says the file contains over 1,500 lines, with a “multitude” of disallows that has kept growing over the years.

The disallows prevent Google from indexing HTML fragments and URLs where AJAX calls are used.

Zieger says it’s not possible to set a noindex, which is another way to keep the fragments and URLs out of Google’s index, so he’s resorted to filling the site’s robots.txt with disallows.
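For illustration, the kind of disallow rules Zieger describes typically look like the following (the paths are hypothetical; a real file like his would repeat such lines hundreds of times, once per fragment path):

```
User-agent: *
Disallow: /fragments/
Disallow: /ajax/
Disallow: /*?ajax=true
```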


Are there any negative SEO effects that can result from a huge robots.txt file?

Here’s what Mueller says.

SEO Considerations For Large Robots.txt Files

A large robots.txt file will not directly cause any negative impact to a site’s SEO.

However, a large file is harder to maintain, which may lead to accidental issues down the road.

Mueller explains:

“No direct negative SEO issues with that, but it makes it a lot harder to maintain. And it makes it a lot easier to accidentally push something that does cause issues.

So just because it’s a large file doesn’t mean it’s a problem, but it makes it easier for you to create problems.”

Zieger follows up by asking if there are any issues with not including a sitemap in the robots.txt file.

Mueller says that’s not a problem:


“No. Those different ways of submitting a sitemap are all equivalent for us.”
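For reference, declaring a sitemap inside robots.txt takes a single line (the URL here is hypothetical); submitting the same URL through Search Console is the equivalent alternative Mueller alludes to:

```
Sitemap: https://www.example.com/sitemap.xml
```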

Zieger then launches into several more follow-up questions, which we’ll take a look at in the next section.

Does Google Recognize HTML Fragments?

Zieger asks Mueller what the SEO impact of radically shortening the robots.txt file would be – removing all the disallows, for example.

The following questions are asked:

  • Does Google recognize HTML fragments that aren’t relevant to site visitors?
  • Would HTML fragments end up in Google’s search index if they weren’t disallowed in robots.txt?
  • How does Google deal with pages where AJAX calls are used (such as a header or footer element)?

He sums up his questions by stating most of what’s disallowed in his robots.txt file are header and footer elements that aren’t interesting for the user.

Mueller says it’s difficult to know exactly what would happen if those fragments were suddenly allowed to be indexed.

A trial and error approach might be the best way of figuring this out, Mueller explains:

“It’s hard to say what you mean with regards to those fragments.

My thought there would be to try to figure out how those fragment URLs are used. And if you’re unsure, maybe take one of these fragment URLs and allow its crawling, and look at the content of that fragment URL, and then check to see what happens in search.

Does it affect anything with regards to the indexed content on your site?
Is some of that content findable within your site suddenly?
Is that a problem or not?

And try to work based on that, because it’s very easy to block things by robots.txt, which actually are not used for indexing, and then you spend a lot of time maintaining this big robots.txt file, but it actually doesn’t change that much for your website.”


Other Considerations For Building A Robots.txt File

Zieger has one last follow-up regarding robots.txt files, asking if there are any specific guidelines to follow when building one.

Mueller says there’s no specific format to follow:

“No, it’s essentially up to you. Like some sites have big files, some sites have small files, they should all just work.

We have an open source code of the robots.txt parser that we use. So what you can also do is get your developers to run that parser for you, or kind of set it up so that you can test it, and then check the URLs on your website with that parser to see which URLs would actually get blocked and what that would change. And that way you can test things before you make them live.”
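Google’s parser is a C++ project, but Mueller’s testing idea can be sketched without it: Python’s standard-library `urllib.robotparser` implements the same Robots Exclusion Protocol and can batch-check which of your URLs a given set of rules would block. The rules and URLs below are hypothetical stand-ins:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; for a live site you would instead call
# parser.set_url("https://www.example.com/robots.txt") and parser.read().
rules = """\
User-agent: *
Disallow: /fragments/
Disallow: /ajax/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Spot-check a handful of URLs before changing the live file.
urls = [
    "https://www.example.com/article.html",
    "https://www.example.com/fragments/header.html",
    "https://www.example.com/ajax/comments",
]
for url in urls:
    verdict = "allowed" if parser.can_fetch("Googlebot", url) else "BLOCKED"
    print(f"{verdict}  {url}")
```

Note that Python’s parser can differ from Google’s own matcher in edge cases (wildcard handling, for instance), so Google’s open-source parser remains the authoritative check before going live.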

The robots.txt parser Mueller refers to can be found on GitHub.

Hear the full discussion in the video below:

Featured Image: Screenshot, January 2022.





How Data Is Reshaping The SEO & Digital Marketer’s Landscape




There is a new data revolution happening, and it’s sweeping across the industry so quickly that many SEO and digital marketers are struggling to make sense of the insights and intelligence at their disposal.

To seize this opportunity, marketers need to evolve their mindsets and use technology to analyze multiple data formats and understand the new opportunities they can bring.

SEO marketers of today and digital marketers of tomorrow will need to quickly extract, structure, and manipulate data to drive the most critical business outcomes.

Data has always been mission-critical to digital decision-making.

The Economist, back in 2017, declared it the world’s most valuable resource.

Fast forward to today, and the exponential growth of the data fueling this revolution is staggering.

According to IDC, the amount of digital data created over the next five years will be more than twice the amount created since the advent of digital storage.


Think about that for a second!

Flash drives, for example, were introduced in 2000.

This means that in the next five years, marketers will have to analyze and make sense of 2x the data created in the last 22 years!

The Data Revolution Means More Sources & Complexity For SEO

The data revolution has been underway for some time now, and it’s changed our concept of what counts as “data” – and rightfully so.

In the past, we thought only numbers mattered.

But, in this new digital world where everything is converted into ones and zeros, data is broader and contains text, audio, and visual information – all bits waiting to be processed!

  • Machine-based and human-generated data is growing 10x faster than conventional business data.
  • Machine-created data is increasing exponentially, at 50x that growth rate. This data revolution is driven primarily by marketing and by consumers who are “always on.”
  • In just the last 18 months, the volume of site processing data we have been generating at BrightEdge has increased by 11x!

As a result of these increasingly demanding trends, SEO and digital marketers need to become more like data analysts and scientists in how they extract structured data insights and business intelligence – without adding more manual work.

Fortunately, SEO is well-positioned to take advantage of this new data revolution.

  • Increasing your keyword universe – More keywords mean more data points for reporting and for fueling insights. While focusing on conversion rate metrics is very important, it wouldn’t be possible without widening the scope of your audience and getting more people in the door. SEO has drifted away from writing for a single dedicated keyword and is now far more advanced, thanks to search engine advancements such as Google’s understanding of search intent through RankBrain and BERT.
  • Increasing your search footprint – This will also help you discover unexplored areas that can inform your future content strategy or spark new keyword ideas. Be careful not to miss the boat on larger shifts, however, such as content management systems slowly turning into “experience platforms” as they offer more functionality to meet the needs of today’s webmaster or marketer.


Data Is The Currency Of An Accelerated SEO & Digital Age

By 2025, worldwide data will reach 175 zettabytes.

But unfortunately, the human brain can’t process, structure, and analyze all that data.


So technology engines have to help, and digital marketers should be the driver.

There is a massive opportunity for companies that can utilize data to create more engaging experiences.

A recent study showed that 95% of business leaders recognize this as their biggest growth lever over the next three years, which means there’s plenty at stake here!


Robust data analysis ensures decisions are evidence-based and have more accountability.

Drawing on existing and new data sources to integrate business acumen and analytical skills into decision-making – and sourcing, managing, and analyzing large amounts of unstructured data – will ensure continued success.

SEO began with data and has evolved.

From the introduction of real-time SEO in 2019 to the Page Experience Update in 2021, SEO’s future again lies with data and the creation of intelligent systems. Here, marketers can leverage combined data sources that structure the data for them.

As a result, they can achieve business objectives and stay ahead during all data and digital transformation stages.



Technology & AI Are Helping SEO Evolve

Advancements in technology – in particular, AI and Natural Language Processing – have meant that SEO and digital marketers can become data analysts without having to become actual data scientists.

This is key to unlocking structured insights from your company’s big data to make more precise predictions about what is coming next based on existing information.

Digital marketers can evolve, understand key trends, and learn in new areas such as:

  • Predictive modeling of future trends and forecasting based on multiple types of data.
  • Real-time identification of opportunities and intelligence.
  • Digital research at scale with both historical and real-time data.
  • Leveraging automated visualizations for various stakeholders.
  • Improved data security and compliance.
  • Market and business intelligence at a macro level.
  • Consumer behavior at the most granular level.

SEO and digital marketers can learn critical skills such as statistics, data analysis, data visualization, and strategy.


AI, NLP, and machine learning are helping them do this without needing expertise in computer programming and software applications.

What digital marketers must do is combine their communication and analytics skills to win over stakeholders who cannot think outside of the advertising box.


Data Analysis & Intelligence As Competitive Advantage

The application of this technology will be the driving force behind the next generation of data analysis.

Therefore, SEO and digital marketers of today should learn how to better utilize insights from data analysis.

It’s becoming more apparent that the marketing platforms of tomorrow will require the capabilities of data analysis and science infrastructure at their core.


The future of marketing will blend technological know-how, business sense, and an understanding of data analysis.

The next generation of SEO will touch all components of marketing, from video, email, and voice, to digital performance of content.

SEO and data science will converge into one evolved discipline that drives omnichannel acquisition and democratizes data.

Marketers who embrace this new era of SEO will be well-positioned to succeed in the years to come.

Data is reconfirming its role as the new competitive advantage, and as SEO and digital marketers, you must evolve if you want to be part of the future.


Featured Image: ra2 studio/Shutterstock




