How to Use Chrome to View a Website as Googlebot

The author’s views are entirely his or her own (excluding the unlikely event of hypnosis) and may not always reflect the views of Moz.

Introduction to Googlebot spoofing

In this article, I’ll describe how and why to use Google Chrome (or Chrome Canary) to view a website as Googlebot.

We’ll set up a web browser specifically for Googlebot browsing. Using a user-agent browser extension is often close enough for SEO audits, but extra steps are needed to get as close as possible to emulating Googlebot.

Skip to “How to set up your Googlebot browser”.

Why should I view a website as Googlebot?

For many years, we technical SEOs had it easy when auditing websites, with HTML and CSS being web design’s cornerstone languages. JavaScript was generally used for embellishments (such as small animations on a webpage).

Increasingly, though, whole websites are being built with JavaScript.

Originally, web servers sent complete websites (fully rendered HTML) to web browsers. These days, many websites are rendered client-side (in the web browser itself) – whether that’s Chrome, Safari, or whatever browser a search bot uses – meaning the user’s browser and device must do the work to render a webpage.

SEO-wise, some search bots don’t render JavaScript, so they won’t see webpages built using it. Compared to HTML and CSS especially, JavaScript is very expensive to render. It uses much more of a device’s processing power (draining the device’s battery) and much more of Google’s, Bing’s, or any search engine’s server resources.

Even Googlebot has difficulties rendering JavaScript and delays rendering of JavaScript beyond its initial URL discovery – sometimes for days or weeks, depending on the website. When I see “Discovered – currently not indexed” for several URLs in Google Search Console’s Coverage (or Pages) section, the website is more often than not JavaScript-rendered.

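Attempting to get around potential SEO issues, some websites use dynamic rendering, so each page has two versions: a server-side rendered version for bots (such as Googlebot) and a client-side rendered version for people using the website.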

Generally, I find that this setup overcomplicates websites and creates more technical SEO issues than a server-side rendered or traditional HTML website.

A mini rant here: there are exceptions, but generally, I think client-side rendered websites are a bad idea. Websites should be designed to work on the lowest common denominator of a device, with progressive enhancement (through JavaScript) used to improve the experience for people using devices that can handle extras. This is something I will investigate further, but my anecdotal evidence suggests client-side rendered websites are generally more difficult to use for people who rely on accessibility devices such as a screen reader. There are instances where technical SEO and usability cross over.

Technical SEO is about making websites as easy as possible for search engines to crawl, render, and index (for the most relevant keywords and topics). Like it or lump it, the future of technical SEO, at least for now, includes lots of JavaScript and different webpage renders for bots and users.

Viewing a website as Googlebot means we can see discrepancies between what a person sees and what a search bot sees. What Googlebot sees doesn’t need to be identical to what a person using a browser sees, but main navigation and the content you want the page to rank for should be the same.

That’s where this article comes in. For a proper technical SEO audit, we need to see what the most common search engine sees. In most English-speaking countries, at least, that’s Google.

Why use Chrome (or Chrome Canary) to view websites as Googlebot?

Can we see exactly what Googlebot sees?

No.

Googlebot itself uses a (headless) version of the Chrome browser to render webpages. Even with the settings suggested in this article, we can never be exactly sure of what Googlebot sees. For example, no settings allow for how Googlebot processes JavaScript websites. Sometimes JavaScript breaks, so Googlebot might see something different than what was intended.

The aim is to emulate Googlebot’s mobile-first indexing as closely as possible.

When auditing, I use my Googlebot browser alongside Screaming Frog SEO Spider’s Googlebot spoofing and rendering, and Google’s own tools, such as URL Inspection in Search Console (which can be automated using SEO Spider) and the rendered screenshot and code from the Mobile-Friendly Test.

Even Google’s own publicly available tools aren’t 100% accurate in showing what Googlebot sees. But along with the Googlebot browser and SEO Spider, they can point towards issues and help with troubleshooting.

Why use a separate browser to view websites as Googlebot?

1. Convenience

Having a dedicated browser saves time. Without relying on or waiting for other tools, I get an idea of how Googlebot sees a website in seconds.

While auditing a website that served different content to browsers and Googlebot, and where issues included inconsistent server responses, I needed to switch between the default browser user-agent and Googlebot more often than usual. But constant user-agent switching using a Chrome browser extension was inefficient.

Some Googlebot-specific Chrome settings don’t save or transport between browser tabs or sessions. Some settings affect all open browser tabs. E.g., disabling JavaScript may stop websites in background tabs that rely on JavaScript from working (such as task management, social media, or email applications).

Short of having a developer build a headless Chrome solution, the “Googlebot browser” setup is an easy way to spoof Googlebot.

2. Improved accuracy

Browser extensions can impact how websites look and perform. This approach keeps the number of extensions in the Googlebot browser to a minimum.

3. Forgetfulness

It’s easy to forget to switch Googlebot spoofing off between browsing sessions, which can lead to websites not working as expected. I’ve even been blocked from websites for spoofing Googlebot, and had to email them with my IP to remove the block.

For which SEO audits is a Googlebot browser useful?

The most common use-case for SEO audits is likely websites using client-side rendering or dynamic rendering. You can easily compare what Googlebot sees to what a general website visitor sees.

Even with websites that don’t use dynamic rendering, you never know what you might find by spoofing Googlebot. After over eight years auditing e-commerce websites, I’m still surprised by issues I haven’t come across before.

Example Googlebot comparisons for technical SEO and content audits:

  • Is the main navigation different?

  • Is Googlebot seeing the content you want indexed?

  • If a website relies on JavaScript rendering, will new content be indexed promptly, or so late that its impact is reduced (e.g. for forthcoming events or new product listings)?

  • Do URLs return different server responses? For example, incorrect URLs can return 200 OK for Googlebot but 404 Not Found for general website visitors.

  • Is the page layout different to what the general website visitor sees? For example, I often see links as blue text on a black background when spoofing Googlebot. While machines can read such text, we want to present something that looks user-friendly to Googlebot. If it can’t render your client-side website, how will it know? (Note: a website might display as expected in Google’s cache, but that isn’t the same as what Googlebot sees.)

  • Do websites redirect based on location? Googlebot mostly crawls from US-based IPs.
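
As a rough illustration, the server-response comparison in the list above could also be scripted. This is a minimal Python sketch, not part of the browser setup described in this article. The function names are made up for the example, and the user-agent strings are illustrative values that should be replaced with current ones copied from Chrome DevTools.

```python
import urllib.error
import urllib.request

# Assumption: illustrative user-agent strings. Copy current values from
# Chrome DevTools (Network conditions) rather than relying on these.
USER_AGENTS = {
    "browser": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/102.0.5005.115 Safari/537.36"
    ),
    "googlebot_smartphone": (
        "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.115 "
        "Mobile Safari/537.36 (compatible; Googlebot/2.1; "
        "+http://www.google.com/bot.html)"
    ),
}


def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code for url after redirects are followed."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses raise HTTPError; the code is still useful.
        return error.code


def compare_statuses(url, user_agents=USER_AGENTS, fetch=fetch_status):
    """Map each user-agent label to the status code the server returned."""
    return {label: fetch(url, ua) for label, ua in user_agents.items()}


def statuses_differ(statuses):
    """True when at least two user agents received different status codes."""
    return len(set(statuses.values())) > 1
```

Calling compare_statuses with a URL returns a dictionary of status codes keyed by user-agent label, and statuses_differ flags URLs worth opening in the Googlebot browser. A script like this only changes the user-agent header. It does not render JavaScript or change location, so it complements the browser rather than replacing it.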

It depends how in-depth you want to go, but Chrome itself has many useful features for technical SEO audits. I sometimes compare its Console and Network tab data for a general visitor vs. a Googlebot visit (e.g. Googlebot might be blocked from files that are essential for page layout or are required to display certain content).

How to set up your Googlebot browser

Once set up (which takes about a half hour), the Googlebot browser solution makes it easy to quickly view webpages as Googlebot.

Step 1: Download and install Chrome or Canary

If Chrome isn’t your default browser, use it as your Googlebot browser.

If Chrome is your default browser, download and install Chrome Canary. Canary is a development version of Chrome where Google tests new features, and it can be installed and run separately to Chrome’s default version.

Canary is named after the yellow canaries once used to detect poisonous gases in mines, and its yellow icon is easy to spot in the Windows Taskbar:

Screenshot of the yellow Chrome Canary icon in a Windows 10 taskbar

As Canary is a development version of Chrome, Google warns that Canary “can be unstable.” But I’m yet to have issues using it as my Googlebot browser.

Step 2: Install browser extensions

I installed five browser extensions and a bookmarklet on my Googlebot browser. I’ll list the extensions, then advise on settings and why I use them.

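For emulating Googlebot (the links are the same whether you use Chrome or Canary):

  • User-Agent Switcher

  • Web Developer

  • Windscribe (or another VPN)

Not required to emulate Googlebot, but my other favorites for technical SEO auditing of JavaScript websites:

  • Link Redirect Trace

  • View Rendered Source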

User-Agent Switcher extension

User-Agent Switcher does what it says on the tin: switches the browser’s user-agent. Chrome and Canary have a user-agent setting, but it only applies to the tab you’re using and resets if you close the browser.

I take the Googlebot user-agent string from Chrome’s browser settings, which at the time of writing will be the latest version of Chrome (note that below, I’m taking the user-agent from Chrome and not Canary).

To get the user-agent, access Chrome DevTools (by pressing F12 or using the hamburger menu to the top-right of the browser window, then navigating to More tools > Developer tools). See the screenshot below or follow these steps:

  1. Go to the Network tab

  2. From the top-right Network hamburger menu: More tools > Network conditions

  3. Click the Network conditions tab that appears lower down the window

  4. Untick “Use browser default”

  5. Select “Googlebot Smartphone” from the list, then copy and paste the user-agent from the field below the list into the User-Agent Switcher extension list (another screenshot below). Don’t forget to switch Chrome back to its default user-agent if it’s your main browser.
    • At this stage, if you’re using Chrome (and not Canary) as your Googlebot browser, you may as well tick “Disable cache” (more on that later).

Screenshot of DevTools showing the steps described above

To access User-Agent Switcher’s list, right-click its icon in the browser toolbar and click Options (see screenshot below). “Indicator Flag” is text that appears in the browser toolbar to show which user-agent has been selected. I chose GS to mean “Googlebot Smartphone”:

Screenshot showing User-Agent Switcher options described in the paragraph above

I added Googlebot Desktop and the bingbots to my list, too.

Why spoof Googlebot’s user agent?

Web servers detect what is browsing a website from a user-agent string. For example, the user-agent for a Windows 10 device using the Chrome browser at the time of writing is:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.115 Safari/537.36

If you’re interested in why other browsers seem to be named in the Chrome user-agent string, read History of the user-agent string.
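
To make the idea of user-agent detection concrete, here is a minimal Python sketch of how a server might sort visitors by user-agent string. It is an assumption for illustration only: the function name and labels are made up, and real servers and CDNs use their own, often stricter, rules. It does show why spoofing works against any site that trusts the string alone.

```python
def classify_user_agent(user_agent):
    """Roughly label a visitor from its user-agent string.

    Hypothetical helper: substring matching like this is easy to fool,
    which is exactly what a user-agent switcher relies on.
    """
    ua = user_agent.lower()
    if "googlebot" in ua:
        # Googlebot Smartphone identifies as a mobile Chrome browser.
        return "googlebot-smartphone" if "mobile" in ua else "googlebot-desktop"
    if "bingbot" in ua:
        return "bingbot"
    return "browser"
```

With the Windows 10 Chrome string quoted above, this sketch returns "browser". With a string containing both "Googlebot" and "Mobile", it returns "googlebot-smartphone".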

Web Developer extension

Web Developer is a must-have browser extension for technical SEOs. In my Googlebot browser, I switch between disabling and enabling JavaScript to see what Googlebot might see with and without JavaScript.

Why disable JavaScript?

Short answer: Googlebot doesn’t execute all (or sometimes any) JavaScript when it first crawls a URL, so we want to see a webpage before any JavaScript is executed.

Long answer: that would be a whole other article.
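
As a rough illustration of “before any JavaScript is executed”, the Python sketch below checks whether key phrases appear in a page’s raw HTML, ignoring anything inside script and style tags. The class and function names are hypothetical, and the sketch assumes you already have the raw HTML as a string (for example, saved from View Source).

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(raw_html):
    """Return the text a visitor could read without running JavaScript."""
    parser = VisibleTextParser()
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.chunks)


def missing_phrases(raw_html, phrases):
    """List the phrases that do not appear in the raw HTML's visible text."""
    text = visible_text(raw_html).lower()
    return [phrase for phrase in phrases if phrase.lower() not in text]
```

If a phrase you want the page to rank for shows up in missing_phrases, that content probably depends on JavaScript, and it is worth checking the page in the Googlebot browser with JavaScript disabled.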

Windscribe (or another VPN)

Windscribe (or your choice of VPN) is used to spoof Googlebot’s US location. I use a pro Windscribe account, but the free account allows up to 2GB data transfer a month and includes US locations.

I don’t think the specific US location matters, but I pretend Gotham is a real place (in a time when Batman and co. have eliminated all villains):

Windscribe browser extension showing location set to New York: Gotham, with a background of the United States of America flag behind a blue overlay

Ensure settings that may impact how webpages display are disabled — Windscribe’s extension blocks ads by default. The two icons to the top-right should show a zero.

For the Googlebot browser scenario, I prefer a VPN browser extension to an application, because the extension is specific to my Googlebot browser.

Why spoof Googlebot’s location?

Googlebot mostly crawls websites from US IPs, and there are many reasons for spoofing Googlebot’s primary location.

Some websites block or show different content based on geolocation. If a website blocks US IPs, for example, Googlebot may never see the website and therefore cannot index it.

Another example: some websites redirect to different websites or URLs based on location. If a company had a website for customers in Asia and a website for customers in America, and redirected all US IPs to the US website, Googlebot would never see the Asian version of the website.

Other Chrome extensions useful for auditing JavaScript websites

With Link Redirect Trace, I see at a glance what server response a URL returns.

The View Rendered Source extension enables easy comparison of raw HTML (what the web server delivers to the browser) and rendered HTML (the code rendered on the client-side browser).

I also added the NoJS Side-by-Side bookmarklet to my Googlebot browser. It compares a webpage with and without JavaScript enabled, within the same browser window.
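
The raw versus rendered comparison these tools make can be approximated in a few lines of Python. This is a sketch under assumptions, not how the extensions themselves work: the names are made up, and it expects two HTML strings you have already saved (the raw source, and the rendered DOM copied from DevTools).

```python
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.add(href)


def extract_links(html):
    """Return the set of link targets found in an HTML string."""
    parser = LinkParser()
    parser.feed(html)
    parser.close()
    return parser.hrefs


def links_only_in_rendered(raw_html, rendered_html):
    """Links that exist only after JavaScript has run, sorted for reading."""
    return sorted(extract_links(rendered_html) - extract_links(raw_html))
```

Links that appear only in the rendered HTML are ones a bot that doesn’t render JavaScript would never see, which makes main navigation a good first thing to check.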

Step 3: Configure browser settings to emulate Googlebot

Next, we’ll configure the Googlebot browser settings in line with what Googlebot doesn’t support when crawling a website.

What doesn’t Googlebot crawling support?

  • Service workers (because people clicking to a page from search results may never have visited before, so it doesn’t make sense to cache data for later visits).

  • Permission requests (e.g. push notifications, webcam, geolocation). If content relies on any of these, Googlebot will not see that content.

  • Cookies, session storage, local storage, and IndexedDB. Googlebot is stateless: data can be stored in these mechanisms, but it is cleared before Googlebot crawls the next URL on a website.

These bullet points are summarized from an interview by Eric Enge with Google’s Martin Splitt.

Step 3a: DevTools settings

To open Developer Tools in Chrome or Canary, press F12, or using the hamburger menu to the top-right, navigate to More tools > Developer tools:

Screenshot showing the steps described above to access DevTools

The Developer Tools window is generally docked within the browser window, but I sometimes prefer it in a separate window. For that, change the “Dock side” in the second hamburger menu:

Screenshot showing the 'Dock side' of DevTools

Disable cache

If using normal Chrome as your Googlebot browser, you may have done this already.

Otherwise, via the DevTools hamburger menu, go to More tools > Network conditions and tick the “Disable cache” option:

DevTools screenshot showing the actions described above to disable cache

Block service workers

To block service workers, go to the Application tab > Service Workers > tick “Bypass for network”:

Screenshot showing the steps described above to disable service workers

Step 3b: General browser settings

In your Googlebot browser, navigate to Settings > Privacy and security > Cookies (or visit chrome://settings/cookies directly) and choose the “Block all cookies (not recommended)” option (isn’t it fun to do something “not recommended”?):

Screenshot showing how to block cookies in Chrome settings

Also in the “Privacy and security” section, choose “Site settings” (or visit chrome://settings/content) and individually block Location, Camera, Microphone, Notifications, and Background sync (and likely anything that appears there in future versions of Chrome):

Screenshot of Chrome's privacy settings

Step 4: Emulate a mobile device

Finally, as our aim is to emulate Googlebot’s mobile-first crawling, emulate a mobile device within your Googlebot browser.

Towards the top-left of DevTools, click the device toolbar toggle, then choose a device to emulate in the browser (you can add other devices too):

Screenshot showing mobile device emulation in Chrome

Whatever device you choose, Googlebot doesn’t scroll on webpages, and instead renders using a window with a long vertical height.

I recommend testing websites in desktop view, too, and on actual mobile devices if you have access to them.

How about viewing a website as bingbot?

To create a bingbot browser, use a recent version of Microsoft Edge with the bingbot user agent.

Bingbot is similar to Googlebot in terms of what it does and doesn’t support.

Yahoo! Search, DuckDuckGo, Ecosia, and other search engines are either powered by or based on Bing search, so Bing is responsible for a higher percentage of search than many people realize.

Summary and closing notes

So, there you have your very own Googlebot emulator.

Using an existing browser to emulate Googlebot is the easiest method to quickly view webpages as Googlebot. It’s also free, assuming you already use a desktop device that can install Chrome and/or Canary.

Other tools exist to help “see” what Google sees. I enjoy testing Google’s Vision API (for images) and their Natural Language API.

Auditing JavaScript websites — especially when they’re dynamically rendered — can be complex, and a Googlebot browser is one way of making the process simpler. If you’d like to learn more about auditing JavaScript websites and the differences between standard HTML and JavaScript-rendered websites, I recommend looking up articles and presentations from Jamie Indigo, Joe Hall and Jess Peck. Two of them contribute in the below video. It’s a good introduction to JavaScript SEO and touches on points I mentioned above:

Questions? Something I missed? Tweet me @AlexHarfordSEO. Thanks for reading!



SEO Recap: ChatGPT – Moz

We’re back with another SEO recap with Tom Capper! As you’ve probably noticed, ChatGPT has taken the search world by storm. But does GPT-3 mean the end of SEO as we know it, or are there ways to incorporate the AI model into our daily work?

Tom tries to tackle this question by demonstrating how he plans to use ChatGPT, along with other natural language processing systems, in his own work.

Be sure to check out the commentary on ChatGPT from our other Moz subject matter experts, Dr. Pete Meyers and Miriam Ellis.

Video Transcription

Hello, I’m Tom Capper from Moz, and today I want to talk about how I’m going to use ChatGPT, and NLP (natural language processing) apps in general, in my day-to-day SEO tasks. This has been a big topic recently. I’ve seen a lot of people tweeting about this. Some people are saying SEO is dead, this is the beginning of the end. As always, I think that’s maybe a bit too dramatic, but there are some big ways that this can be useful and that this will affect SEOs in their industry, I think.

The first question I want to ask is, “Can we use this instead of Google? Are people going to start using NLP-powered assistants instead of search engines in a big way?”

So just being meta here, I asked ChatGPT to write a song about Google’s search results being ruined by an influx of AI content. This is obviously something that Google themselves is really concerned about, right? They talked about it with the helpful content update. Now I think the fact that we can be concerned about AI content ruining search results suggests there might be some problem with an AI-powered search engine, right?

Now, AI-powered is maybe the wrong term because, obviously, Google themselves are to some degree AI-powered, but I mean pure, AI-written results. So for example, I stole this from a tweet and I’ve credited the account below, but if you ask it, “What is the fastest marine mammal,” the fastest marine mammal is the peregrine falcon. That is not a mammal.

Then it mentions the sailfish, which is not a mammal, and marlin, which is not a mammal. This is a particularly bad result. Whereas if I google this, great, that is an example of a fast mammal. We’re at least on the right track. Similarly, if I’m looking for a specific article on a specific web page, I’ve searched Atlantic article about the declining quality of search results, and even though clearly, if you look at the other information that it surfaces, clearly this has consumed some kind of selection of web pages, it’s refusing to acknowledge that here.

Whereas obviously, if I google that, very easy. I can find what I’m looking for straightaway. So yeah, maybe I’m not going to just replace Google with ChatGPT just yet. What about writing copy though? What about I’m fed up of having to manually write blog posts about content that I want to rank for or that I think my audience want to hear about?

So I’m just going to outsource it to a robot. Well, here’s an example. “Write a blog post about the future of NLP in SEO.” Now, at first glance, this looks okay. But actually, when you look a little bit closer, it’s a bluff. It’s vapid. It doesn’t really use any concrete examples.

It doesn’t really read the room. It doesn’t talk about sort of how our industry might be affected more broadly. It just uses some quick tactical examples. It’s not the worst article you could find. I’m sure if you pulled a teenager off the street who knew nothing about this and asked them to write about it, they would probably produce something worse than this.

But on the other hand, if you saw an article on the Moz blog or on another industry credible source, you’d expect something better than this. So yeah, I don’t think that we’re going to be using ChatGPT as our copywriter right away, but there may be some nuance, which I’ll get to in just a bit. What about writing descriptions though?

I thought this was pretty good. “Write a meta description for my Moz blog post about SEO predictions in 2023.” Now I could do a lot better with the query here. I could tell it what my post is going to be about for starters so that it could write a more specific description. But this is already quite good. It’s the right length for a meta description. It covers the bases.

It’s inviting people to click. It makes it sound exciting. This is pretty good. Now you’d obviously want a human to review these for the factual issues we talked about before. But I think a human plus the AI is going to be more effective here than just the human or at least more time efficient. So that’s a potential use case.

What about ideating copy? So I said that the pure ChatGPT written blog post wasn’t great. But one thing I could do is get it to give me a list of subtopics or subheadings that I might want to include in my own post. So here, although it is not the best blog post in the world, it has covered some topics that I might not have thought about.

So I might want to include those in my own post. So instead of asking it “write a blog post about the future of NLP in SEO,” I could say, “Write a bullet point list of ways NLP might affect SEO.” Then I could steal some of those, if I hadn’t thought of them myself, as potential topics that my own ideation had missed. Similarly you could use that as a copywriter’s brief or something like that, again in addition to human participation.

My favorite use case so far though is coding. So personally, I’m not a developer by trade, but often, like many SEOs, I have to interact with SQL, with JavaScript, with Excel, and these kinds of things. That often results in a lot of googling from first principles for someone less experienced in those areas.

Even experienced coders often find themselves falling back to Stack Overflow and this kind of thing. So here’s an example. “Write an SQL query that extracts all the rows from table2 where column A also exists as a row in table1.” So that’s quite complex. I’ve not really made an effort to make that query very easy to understand, but the result is actually pretty good.

It’s a working piece of SQL with an explanation below. This is much quicker than me figuring this out from first principles, and I can take that myself and work it into something good. So again, this is AI plus human rather than just AI or just human being the most effective. I could get a lot of value out of this, and I definitely will. I think in the future, rather than starting by going to Stack Overflow or googling something where I hope to see a Stack Overflow result, I think I would start just by asking here and then work from there.
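
The transcript doesn’t reproduce the query ChatGPT returned. As a rough sketch of the kind of query described in the prompt above, the Python example below runs one possible answer against an in-memory SQLite database. The table contents, the extra "note" column, and the reading of “column A” as a column named A in both tables are all assumptions made for illustration.

```python
import sqlite3

# Assumption: tiny made-up tables, just enough to run the query.
connection = sqlite3.connect(":memory:")
connection.executescript(
    """
    CREATE TABLE table1 (A TEXT);
    CREATE TABLE table2 (A TEXT, note TEXT);
    INSERT INTO table1 (A) VALUES ('x'), ('y');
    INSERT INTO table2 (A, note) VALUES
        ('x', 'kept'), ('z', 'dropped'), ('y', 'kept too');
    """
)

# Keep every row of table2 whose A value also appears in table1.
QUERY = """
    SELECT *
    FROM table2
    WHERE A IN (SELECT A FROM table1)
    ORDER BY A
"""

matching_rows = connection.execute(QUERY).fetchall()
print(matching_rows)
```

An inner join or an EXISTS subquery would be reasonable alternatives. A join can return duplicate rows when table1 repeats a value, which the IN form avoids.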

That’s all. So that’s how I think I’m going to be using ChatGPT in my day-to-day SEO tasks. I’d love to hear what you’ve got planned. Let me know. Thanks.

What Is a White Paper? [FAQs]

The definition of a whitepaper varies heavily from industry to industry, which can be a little confusing for marketers looking to create one for their business.

The old-school definition comes from politics, where it means a legislative document explaining and supporting a particular political solution.

HubSpot to cut around 7% of workforce by end of Q1

This afternoon, HubSpot announced it would be making cuts in its workforce during Q1 2023. In a Securities and Exchange Commission filing it put the scale of the cuts at 7%. This would mean losing around 500 employees from its workforce of over 7,000.

The reasons cited were a downward trend in business and a “faster deceleration” than expected following positive growth during the pandemic.

Layoffs follow swift growth. Indeed, the layoffs need to be seen against the background of very rapid growth at the company. The size of the workforce at HubSpot grew over 40% between the end of 2020 and today.

In 2022 it announced a major expansion of its international presence with new operations in Spain and the Netherlands and a plan to expand its Canadian presence in 2023.

Why we care. The current cool down in the martech space, and in tech generally, does need to be seen in the context of startling leaps forward made under pandemic conditions. As the importance of digital marketing and the digital environment in general grew at an unprecedented rate, vendors saw opportunities for growth.

The world is re-adjusting. We may not be seeing a bubble burst, but we are seeing a bubble undergoing some slight but predictable deflation.



About the author

Kim Davis

Kim Davis is the Editorial Director of MarTech. Born in London, but a New Yorker for over two decades, Kim started covering enterprise software ten years ago. His experience encompasses SaaS for the enterprise, digital-ad data-driven urban planning, and applications of SaaS, digital technology, and data in the marketing space.

He first wrote about marketing technology as editor of Haymarket’s The Hub, a dedicated marketing tech website, which subsequently became a channel on the established direct marketing brand DMN. Kim joined DMN proper in 2016, as a senior editor, becoming Executive Editor, then Editor-in-Chief, a position he held until January 2020.

Prior to working in tech journalism, Kim was Associate Editor at a New York Times hyper-local news site, The Local: East Village, and has previously worked as an editor of an academic publication, and as a music journalist. He has written hundreds of New York restaurant reviews for a personal blog, and has been an occasional guest contributor to Eater.
