Google Confirms Robots.txt Can’t Prevent Unauthorized Access

Google’s Gary Illyes confirmed a common observation that robots.txt has limited control over unauthorized access by crawlers. Gary then offered an overview of access controls that all SEOs and website owners should know.

Common Argument About Robots.txt

Seems like any time the topic of Robots.txt comes up there’s always that one person who has to point out that it can’t block all crawlers.

Gary agreed with that point:

“‘robots.txt can’t prevent unauthorized access to content’, a common argument popping up in discussions about robots.txt nowadays; yes, I paraphrased. This claim is true, however I don’t think anyone familiar with robots.txt has claimed otherwise.”

Next, he took a deep dive into what blocking crawlers really means. He framed it as a choice between a solution that enforces control on the website's side and one that cedes control to the requestor: a browser or crawler requests access, and the server can respond in several ways.

He listed examples of control:

  • Robots.txt (leaves it up to the crawler to decide whether or not to crawl)
  • Firewalls, such as a web application firewall or WAF (the firewall controls access)
  • Password protection
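To make the first item in that list concrete, here is a minimal sketch of why robots.txt is advisory. It uses Python's standard `urllib.robotparser`; the robots.txt content, the bot name, and the URLs are made up for illustration. The point is only that the check runs inside the crawler, so a crawler that skips it is not stopped by anything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def polite_crawler_may_fetch(user_agent: str, url: str) -> bool:
    """What a well-behaved crawler does: read robots.txt and obey it."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


def rude_crawler_may_fetch(user_agent: str, url: str) -> bool:
    """A crawler that never consults robots.txt. Nothing in the file stops it."""
    return True


if __name__ == "__main__":
    url = "https://example.com/private/report.html"
    print("polite crawler fetches:", polite_crawler_may_fetch("ExampleBot", url))
    print("rude crawler fetches:", rude_crawler_may_fetch("ExampleBot", url))
```

Both functions live on the requestor's side, which is the whole limitation: the website only publishes the rule and hopes it is followed.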

Here are his remarks:

“If you need access authorization, you need something that authenticates the requestor and then controls access. Firewalls may do the authentication based on IP, your web server based on credentials handed to HTTP Auth or a certificate to its SSL/TLS client, or your CMS based on a username and a password, and then a 1P cookie.

There’s always some piece of information that the requestor passes to a network component that will allow that component to identify the requestor and control its access to a resource. robots.txt, or any other file hosting directives for that matter, hands the decision of accessing a resource to the requestor which may not be what you want. These files are more like those annoying lane control stanchions at airports that everyone wants to just barge through, but they don’t.

There’s a place for stanchions, but there’s also a place for blast doors and irises over your Stargate.

TL;DR: don’t think of robots.txt (or other files hosting directives) as a form of access authorization, use the proper tools for that for there are plenty.”
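By contrast, here is a rough sketch of the kind of check Gary describes, where the server authenticates the requestor and then decides. It validates an HTTP Basic `Authorization` header value. The username, password, and function names are hypothetical, and a real site would rely on its web server or CMS (and stored password hashes) rather than hand-rolled code like this.

```python
import base64
import hmac
from typing import Optional

# Hypothetical credentials for illustration; real systems store salted hashes.
EXPECTED_USER = "editor"
EXPECTED_PASSWORD = "correct horse battery staple"


def is_authorized(authorization_header: Optional[str]) -> bool:
    """Return True only if the header carries the expected Basic credentials."""
    if not authorization_header:
        return False
    scheme, _, encoded = authorization_header.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:  # covers invalid base64 and invalid UTF-8
        return False
    user, separator, password = decoded.partition(":")
    if not separator:
        return False
    # Constant-time comparisons avoid leaking how much of a guess matched.
    user_ok = hmac.compare_digest(user.encode(), EXPECTED_USER.encode())
    password_ok = hmac.compare_digest(password.encode(), EXPECTED_PASSWORD.encode())
    return user_ok and password_ok


def status_for(authorization_header: Optional[str]) -> int:
    """The server, not the requestor, decides: 200 if authorized, else 401."""
    return 200 if is_authorized(authorization_header) else 401
```

Unlike robots.txt, a requestor that ignores this rule simply gets a 401 response.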

Use The Proper Tools To Control Bots

There are many ways to block scrapers, hacker bots, AI user agents, and search crawlers. Aside from blocking search crawlers, a firewall of some type is a good solution because it can block by behavior (like crawl rate), IP address, user agent, and country, among many other criteria. Typical solutions work at the server level with something like Fail2Ban, in the cloud like Cloudflare WAF, or as a WordPress security plugin like Wordfence.
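As a rough illustration of the kinds of rules mentioned above (IP address, user agent, and crawl rate), here is a small sketch of a request filter. It is not how Fail2Ban, Cloudflare WAF, or Wordfence are implemented; the class name, the blocked network (a reserved documentation range), the user agent substring, and the limits are all assumptions chosen for the example.

```python
import ipaddress
from collections import defaultdict, deque

# All values below are hypothetical examples.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
BLOCKED_AGENT_SUBSTRINGS = ("badbot",)
MAX_REQUESTS = 5        # allowed requests per window, per IP
WINDOW_SECONDS = 10.0   # length of the sliding window


class RequestFilter:
    """Decides server-side whether a request is allowed through."""

    def __init__(self) -> None:
        # Timestamps of recently allowed requests, keyed by IP string.
        self._recent = defaultdict(deque)

    def allow(self, ip: str, user_agent: str, now: float) -> bool:
        address = ipaddress.ip_address(ip)
        # Rule 1: block by IP range.
        if any(address in network for network in BLOCKED_NETWORKS):
            return False
        # Rule 2: block by user agent (easy to spoof, so weak on its own).
        agent = user_agent.lower()
        if any(fragment in agent for fragment in BLOCKED_AGENT_SUBSTRINGS):
            return False
        # Rule 3: block by behavior, here a simple sliding-window rate limit.
        hits = self._recent[ip]
        while hits and now - hits[0] >= WINDOW_SECONDS:
            hits.popleft()
        if len(hits) >= MAX_REQUESTS:
            return False
        hits.append(now)
        return True
```

Passing `now` in explicitly keeps the sketch easy to test; a real deployment would read the clock itself and would usually sit in front of the application rather than inside it.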

Read Gary Illyes’ post on LinkedIn:

robots.txt can’t prevent unauthorized access to content

Featured Image by Shutterstock/Ollyy


Mediavine Bans Publisher For Overuse Of AI-Generated Content

According to details surfacing online, ad management firm Mediavine is terminating publishers’ accounts for overusing AI.

Mediavine is a leading ad management company providing products and services to help website publishers monetize their content.

The company holds elite status as a Google Certified Publishing Partner, which indicates that it meets Google’s highest standards and requirements for ad networks and exchanges.

AI Content Triggers Account Terminations

The terminations came to light in a post on the Reddit forum r/Blogging, where a user shared an email they received from Mediavine citing “overuse of artificially created content.”

Trista Jensen, Mediavine’s Director of Ad Operations & Market Quality, states in the email:

“Our third party content quality tools have flagged your sites for overuse of artificially created content. Further internal investigation has confirmed those findings.”

Jensen stated that due to the overuse of AI content, “our top partners will stop spending on your sites, which will negatively affect future monetization efforts.”

Consequently, Mediavine terminated the publisher’s account “effective immediately.”

The Risks Of Low-Quality AI Content

This strict enforcement aligns with Mediavine’s publicly stated policy prohibiting websites from using “low-quality, mass-produced, unedited or undisclosed AI content that is scraped from other websites.”

In a March 7 blog post titled “AI and Our Commitment to a Creator-First Future,” the company declared opposition to low-value AI content that could “devalue the contributions of legitimate content creators.”

Mediavine warned in the post:

“Without publishers, there is no open web. There is no content to train the models that power AI. There is no internet.”

The company says it’s using its platform to “advocate for publishers” and uphold quality standards in the face of AI’s disruptive potential.

Mediavine states:

“We’re also developing faster, automated tools to help us identify low-quality, mass-produced AI content across the web.”

Targeting ‘AI Clickbait Kingpin’ Tactics

While the Reddit user’s identity wasn’t disclosed, the incident has drawn connections to the tactics of Nebojša Vujinović Vujo, who was dubbed an “AI Clickbait Kingpin” in a recent Wired exposé.

According to Wired, Vujo acquired over 2,000 dormant domains and populated them with AI-generated, search-optimized content designed purely to capture ad revenue.

His strategies represent the low-quality, artificial content Mediavine has vowed to prohibit.

Potential Implications

Lost Revenue

Mediavine’s terminations highlight potential implications for publishers that rely on artificial intelligence to generate website content at scale.

Perhaps the most immediate and tangible implication is the risk of losing ad revenue.

For publishers that depend heavily on programmatic advertising or sponsored content deals as key revenue drivers, being blocked from major ad networks could devastate their business models.

Devalued Domains

Another potential impact is the devaluation of domains and websites built primarily on AI-generated content.

If this pattern of AI content overuse triggers account terminations from companies like Mediavine, it could drastically diminish the value proposition of scooping up these domains.

Damaged Reputations & Brands

Beyond the lost monetization opportunities, publishers leaning too heavily into automated AI content also risk permanent reputational damage to their brands.

Once a determining authority flags a website for AI overuse, it could impact how that site is perceived by readers, other industry partners, and search engines.

In Summary

AI has value as an assistive tool for publishers, but relying heavily on automated content creation poses significant risks.

These include monetization challenges, potential reputation damage, and increasing regulatory scrutiny. Mediavine’s strict policy illustrates the possible consequences for publishers.

It’s important to note that Mediavine’s move to terminate publisher accounts over AI content overuse represents an independent policy stance taken by the ad management firm itself.

The action doesn’t directly reflect the content policies or enforcement positions of Google, whose publishing partner program Mediavine is certified under.

We have reached out to Mediavine requesting a comment on this story. We’ll update this article with more information when it’s provided.


Featured Image: Simple Line/Shutterstock

Google’s Guidance About The Recent Ranking Update

Google’s Danny Sullivan explained the recent update, addressing site recoveries and cautioning against making radical changes to improve rankings. He also offered advice for publishers whose rankings didn’t improve after the last update.

Google’s Still Improving The Algorithm

Danny said that Google is still working on its ranking algorithm, indicating that more changes (for the positive) are likely on the way. The main idea he was getting across is that Google is still trying to fill the gaps in surfacing high quality content from independent sites, which is good because big brand sites don’t necessarily have the best answers.

He wrote:

“…the work to connect people with “a range of high quality sites, including small or independent sites that are creating useful, original content” is not done with this latest update. We’re continuing to look at this area and how to improve further with future updates.”

A Message To Those Who Were Left Behind

He also had a message for publishers whose sites failed to recover with the latest update: Google is still working to surface more independent content, and there may be relief on the next go.

Danny advised:

“…if you’re feeling confused about what to do in terms of rankings…if you know you’re producing great content for your readers…If you know you’re producing it, keep doing that…it’s to us to keep working on our systems to better reward it.”

Google Cautions Against “Improving” Sites

Something really interesting that he mentioned was a caution against trying to improve rankings of something that’s already on page one in order to rank even higher. Tweaking a site to get from position six or whatever to something higher has always been a risky thing to do for many reasons I won’t elaborate on here. But Danny’s warning increases the pressure to not just think twice before trying to optimize a page for search engines but to think three times and then some more.

Danny cautioned that sites that make it to the top of the SERPs should consider that a win and let it ride instead of making changes right now to improve their rankings. The reason for that caution is that the search results continue to change, and the implication is that changing a site now may negatively impact its rankings in a newly updated search index.

He wrote:

“If you’re showing in the top results for queries, that’s generally a sign that we really view your content well. Sometimes people then wonder how to move up a place or two. Rankings can and do change naturally over time. We recommend against making radical changes to try and move up a spot or two”

How Google Handled Feedback

There was also some light shed on what Google did with all the feedback it received from publishers who lost rankings. Danny wrote that the feedback and site examples he received were summarized and sent to the search engineers for review. They continue to use that feedback for the next round of improvements.

He explained:

“I went through it all, by hand, to ensure all the sites who submitted were indeed heard. You were, and you continue to be. …I summarized all that feedback, pulling out some of the compelling examples of where our systems could do a better job, especially in terms of rewarding open web creators. Our search engineers have reviewed it and continue to review it, along with other feedback we receive, to see how we can make search better for everyone, including creators.”

Feedback Itself Didn’t Lead To Recovery

Danny also pointed out that sites that recovered their rankings did not do so because they submitted feedback to Google. Danny wasn’t specific about this point, but it conforms with previous statements that Google implements algorithm fixes at scale. So instead of saying, “Hey, let’s fix the rankings of this one site,” it’s more about figuring out whether the problem is symptomatic of something widescale and how to change things for everybody with the same problem.

Danny wrote:

“No one who submitted, by the way, got some type of recovery in Search because they submitted. Our systems don’t work that way.”

It shouldn’t be surprising that the feedback was used as data rather than leading directly to recoveries. Even as far back as the 2003 Florida Update, Matt Cutts collected feedback from people, including myself, and I didn’t see a recovery for a false positive until everyone else also got back their rankings.

Takeaways

Google’s work on their algorithm is ongoing:
Google is continuing to tune its algorithms to improve its ability to rank high quality content, especially from smaller publishers. Danny Sullivan emphasized that this is an ongoing process.

What content creators should focus on:
Danny’s statement encouraged publishers to focus on consistently creating high quality content and not to focus on optimizing for algorithms. Focusing on quality should be the priority.

What should publishers do if their high-quality content isn’t yet rewarded with better rankings?
Publishers who are certain of the quality of their content are encouraged to hold steady and keep it coming because Google’s algorithms are still being refined.

Read the post on LinkedIn.

Featured Image by Shutterstock/Cast Of Thousands

Plot Up To Five Metrics At Once

Google has rolled out changes to Analytics, adding features to help you make more sense of your data.

The update brings several key improvements:

  • You can now compare up to five different metrics side by side.
  • A new tool automatically spots unusual trends in your data.
  • A more detailed report on transactions gives a closer look at revenue.
  • The acquisition reports now separate user and session data more clearly.
  • New descriptions make it easier to understand what each report does.

Here’s an overview of these new features, why they matter, and how they might help improve your data analysis and decision-making.

Plot Rows: Enhanced Data Visualization

The most prominent addition is the “Plot Rows” feature.

You can now visualize up to five rows of data simultaneously within your reports, allowing for quick comparisons and trend analysis.

This feature is accessible by selecting the desired rows and clicking the “Plot Rows” option.

Anomaly Detection: Spotting Unusual Patterns

Google Analytics has implemented an anomaly detection system to help you identify potential issues or opportunities.

This new tool automatically flags unusual data fluctuations, making it easier to spot unexpected traffic spikes, sudden drops, or other noteworthy trends.
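Google's own detection method isn't described here, so the following is only a generic sketch of what "flagging unusual fluctuations" can look like: each value is compared against the mean and standard deviation of the days before it. The function name, the seven-day window, the threshold, and the sample numbers are all assumptions for illustration, not Google Analytics behavior.

```python
from statistics import mean, stdev
from typing import List, Sequence


def flag_anomalies(values: Sequence[float], window: int = 7,
                   threshold: float = 3.0) -> List[int]:
    """Return indexes whose value sits far outside the preceding window."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: any change at all is unusual.
            if values[i] != mu:
                flagged.append(i)
            continue
        # z-score: how many standard deviations away from the recent mean.
        if abs(values[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Hypothetical daily sessions with one obvious spike on day index 7.
    daily_sessions = [100, 102, 98, 101, 99, 103, 100, 400, 101]
    print(flag_anomalies(daily_sessions))
```

A flag like this only says that a number is unusual relative to recent history. It says nothing about why, which is the reason flagged points still need context.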

Improved Report Navigation & Understanding

Google Analytics has added hover-over descriptions for report titles.

These brief explanations provide context and include links to more detailed information about each report’s purpose and metrics.

Key Event Marking In Events Report

The Events report allows you to mark significant events for easy reference.

This feature, accessed through a three-dot menu at the end of each event row, helps you prioritize and track important data points.

New Transactions Report For Revenue Insights

For ecommerce businesses, the new Transactions report offers granular insights into revenue streams.

This feature provides information about each transaction, utilizing the transaction_id parameter to give you a comprehensive view of sales data.
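For context on where transaction_id comes from, here is a minimal sketch of a purchase event payload that carries it. The shape (a client_id plus an events list whose purchase event has transaction_id, value, and currency params) follows my understanding of the GA4 Measurement Protocol, so treat it as an assumption and check Google's documentation before relying on it. The IDs and amounts are made up, and nothing is sent over the network.

```python
import json
from typing import Any, Dict


def build_purchase_event(client_id: str, transaction_id: str,
                         value: float, currency: str) -> Dict[str, Any]:
    """Build a purchase event body that includes a transaction_id.

    Field names are assumed to match the GA4 Measurement Protocol.
    """
    if not transaction_id:
        # Without an ID, individual transactions can't be told apart.
        raise ValueError("transaction_id must not be empty")
    return {
        "client_id": client_id,
        "events": [
            {
                "name": "purchase",
                "params": {
                    "transaction_id": transaction_id,
                    "value": value,
                    "currency": currency,
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical values for illustration.
    payload = build_purchase_event("555.123", "T-10001", 49.99, "USD")
    print(json.dumps(payload, indent=2))
```

Using one unique transaction_id per order is what lets a per-transaction report separate one sale from another.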

Scope Changes In Acquisition Reports

Google has refined its acquisition reports to offer more targeted metrics.

The User Acquisition report now includes user-related metrics such as Total Users, New Users, and Returning Users.

Meanwhile, the Traffic Acquisition report focuses on session-related metrics like Sessions, Engaged Sessions, and Sessions per Event.

What To Do Next

As you explore these new features, keep in mind:

  • Familiarize yourself with the new Plot Rows function to make the most of comparative data analysis.
  • Pay attention to the anomaly detection alerts, but always investigate the context behind flagged data points.
  • Take advantage of the more detailed Transactions report to understand your revenue patterns better.
  • Experiment with the refined acquisition reports to see which metrics are most valuable for your needs.

As with any new tool, there will likely be a learning curve as you incorporate these features into your workflow.


FAQ

What is the “Plot Rows” feature in Google Analytics?

The “Plot Rows” feature allows you to visualize up to five rows of data at the same time. This makes it easier to compare different metrics side by side within your reports, facilitating quick comparisons and trend analysis. To use this feature, select the desired rows and click the “Plot Rows” option.

How does the new anomaly detection system work in Google Analytics?

Google Analytics’ new anomaly detection system automatically flags unusual data patterns. This tool helps identify potential issues or opportunities by spotting unexpected traffic spikes, sudden drops, or other notable trends, making it easier for users to focus on significant data fluctuations.

What improvements have been made to the Transactions report in Google Analytics?

The enhanced Transactions report provides detailed insights into revenue for ecommerce businesses. It utilizes the transaction_id parameter to offer granular information about each transaction, helping businesses get a better understanding of their revenue streams.


Featured Image: Vladimka production/Shutterstock


