
Bingbot User Agent is Changing

Bing announced that it is changing the user agent string that identifies its crawler, Bingbot. There will now be two user agents: one for the desktop crawler and another for the mobile crawler.

The new user agents provide more information, including a browser version placeholder (W.X.Y.Z) that Bing updates to reflect the current version of Microsoft Edge.

User Agents

A user agent is a string sent by client software that tells the website’s server what kind of software or device is visiting it.

A user agent can tell a website that the visitor is a browser or a search engine bot.

Some website software and SaaS applications rely on the user agent to identify search bots, for example to keep them from crawling search or registration pages.
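As a minimal illustration (a sketch, not from the article), here is how server-side code might use a partial match on the User-Agent header to keep crawlers out of certain pages; the bot tokens and paths are hypothetical:

# Sketch: deciding whether a request comes from a search bot and whether to block it.
# The bot tokens and blocked paths below are illustrative assumptions, not from the article.
BOT_TOKENS = ("bingbot/", "googlebot/")
BLOCKED_PATHS = ("/search", "/register")

def should_block(user_agent: str, path: str) -> bool:
    """Return True when a known crawler requests a page we keep bots out of."""
    is_bot = any(token in user_agent.lower() for token in BOT_TOKENS)
    return is_bot and path in BLOCKED_PATHS

# Example: the old Bingbot user agent requesting the registration page.
ua = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
print(should_block(ua, "/register"))  # True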

Bing User Agent Change is a Transition

Microsoft will continue to use the old Bingbot crawler user agent until fall 2022; it has not announced a more specific cutover date.

However, the Bing Webmaster Tools URL inspection tool has already switched over to the new bingbot user agent.

Some apps and software may need updating, and the transition period gives their developers several months to update their code so that it can accurately identify Bingbot.

There is no need to update robots.txt files, however, because the new crawler can continue to be addressed in robots.txt as bingbot.

Likewise, software and services that match user agents partially, such as phpBB, will not need code updates to accommodate the new user agent. If a publisher, app, or software package uses bingbot/ to identify Bingbot with a partial match, that code will continue to work.
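For example, an existing robots.txt group like the following (the disallowed paths are illustrative, not from the article) keeps applying to both the old and new user agents:

User-agent: bingbot
Disallow: /search
Disallow: /register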

Old and New User Agents

This is the old bingbot user agent:

Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)

The new Bingbot user agents are an improvement because they distinguish between the desktop and mobile versions of the Microsoft Bing web crawler.

These are the new bingbot user agents:

Desktop

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/W.X.Y.Z Safari/537.36

Mobile

Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
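As noted above, a partial match on bingbot/ keeps working. A quick sketch (not from the announcement) verifying that against the old and both new strings:

# Sketch: confirm that a "bingbot/" substring match identifies all three user agents.
OLD_UA = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
NEW_DESKTOP_UA = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
                  "bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/W.X.Y.Z Safari/537.36")
NEW_MOBILE_UA = ("Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
                 "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 "
                 "(compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)")

for ua in (OLD_UA, NEW_DESKTOP_UA, NEW_MOBILE_UA):
    print("bingbot/" in ua)  # prints True three times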

According to the official Microsoft Bing announcement:

“We will carefully test websites before switching them to our new user-agent. Bing Webmaster Tools URL Inspection has already started using the new desktop user-agent for the Live URL Test to help you investigate potential issues.

In case you face any challenges feel free to contact us.”

Citation

Read the official Microsoft Bing announcement: Announcing user-agent change for Bing crawler bingbot


Googlebot Crawls & Indexes First 15 MB HTML Content


In an update to Googlebot’s help document, Google quietly announced that it will crawl only the first 15 MB of a webpage. Anything after this cutoff will not be included in ranking calculations.

Google specifies in the help document:

“Any resources referenced in the HTML such as images, videos, CSS and JavaScript are fetched separately. After the first 15 MB of the file, Googlebot stops crawling and only considers the first 15 MB of the file for indexing. The file size limit is applied on the uncompressed data.”

This left some in the SEO community wondering whether this meant Googlebot would completely disregard text that appears below images in an HTML file if those images pushed the page past the cutoff.

“It’s specific to the HTML file itself, like it’s written,” John Mueller, Google Search Advocate, clarified via Twitter. “Embedded resources/content pulled in with IMG tags is not a part of the HTML file.”

What This Means For SEO

To ensure it is weighted by Googlebot, important content must now be included near the top of webpages. This means code must be structured in a way that puts the SEO-relevant information within the first 15 MB of an HTML or supported text-based file.

It also means images and videos should be compressed, not encoded directly into the HTML (for example, as base64 data URIs), whenever possible.

SEO best practices currently recommend keeping HTML pages to 100 KB or less, so many sites will be unaffected by this change. Page size can be checked with a variety of tools, including Google Page Speed Insights.
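As a quick illustration (a sketch, not from the article; the URL below is a placeholder), you can also measure a page’s uncompressed HTML size yourself:

# Sketch: fetch a page and compare its uncompressed HTML size to Googlebot's 15 MB cutoff.
# The URL is a placeholder assumption; swap in a real page to test.
import urllib.request

LIMIT_BYTES = 15 * 1024 * 1024  # Google applies the limit to uncompressed data

with urllib.request.urlopen("https://example.com/") as response:
    html = response.read()  # urllib does not request a compressed response by default

size = len(html)
print(f"HTML size: {size / 1024:.1f} KB")
if size <= LIMIT_BYTES:
    print("Within Googlebot's 15 MB indexing cutoff")
else:
    print("Exceeds Googlebot's 15 MB indexing cutoff")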

In theory, it may sound worrisome that you could have content on a page that doesn’t get used for indexing. In practice, however, 15 MB is a considerably large amount of HTML.

As Google states, resources such as images and videos are fetched separately. Based on Google’s wording, the 15 MB cutoff applies to the HTML file only.

It would be difficult to exceed that limit with HTML alone unless you were publishing entire books’ worth of text on a single page.

Should you have pages that exceed 15 MB of HTML, it’s likely you have underlying issues that need to be fixed anyway.


Source: Google Search Central

