MARKETING
What is a Web Crawler? (In 50 Words or Less)
When it comes to technical SEO, it can be difficult to understand how it all works. But it’s important to gain as much knowledge as we can to optimize our websites and reach larger audiences. One tool that plays a large role in search engine optimization is none other than the web crawler.
In this post, we’ll learn what web crawlers are, how they work, and why they should crawl your site.
What is a web crawler?
A web crawler — also known as a web spider — is a bot that searches and indexes content on the internet. Essentially, web crawlers are responsible for understanding the content on a web page so they can retrieve it when an inquiry is made.
You might be wondering, “Who runs these web crawlers?”
Usually, web crawlers are operated by search engines with their own algorithms. The algorithm will tell the web crawler how to find relevant information in response to a search query.
A web spider will crawl and categorize every web page on the internet that it can find and is told to index. That means you can tell a web crawler not to crawl your web page if you don’t want it to be found on search engines.
To do this, you’d upload a robots.txt file. Essentially, a robots.txt file will tell a search engine how to crawl and index the pages on your site.
For example, let’s take a look at Nike.com/robots.txt.
Nike uses its robots.txt file to determine which links on its website will be crawled and indexed.
In this portion of the file, it determined that:
- The web crawler Baiduspider was allowed to crawl the first seven links
- The web crawler Baiduspider was disallowed from crawling the remaining three links
This is beneficial for Nike because some of the company’s pages aren’t meant to be searchable, and disallowing those links won’t affect the optimized pages that help it rank in search engines.
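For illustration, a robots.txt file with rules like the ones described above might look something like this (the user agent comes from the example; the paths are made up, not Nike’s actual file):

```
# Rules for Baidu's crawler
User-agent: Baiduspider
Allow: /shoes/
Disallow: /checkout/

# Rules for all other crawlers
User-agent: *
Allow: /
```

Each `User-agent` block applies to one crawler, and the `Allow`/`Disallow` lines tell it which paths it may visit.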
So now that we know what web crawlers are, how do they do their job? Below, let’s review how web crawlers work.
How do web crawlers work?
A web crawler works by discovering URLs and reviewing and categorizing web pages. Along the way, they find hyperlinks to other webpages and add them to the list of pages to crawl next. Web crawlers are smart and can determine the importance of each web page.
A search engine’s web crawler most likely won’t crawl the entire internet. Rather, it will decide the importance of each web page based on factors including how many other pages link to that page, page views, and even brand authority. So, a web crawler will determine which pages to crawl, what order to crawl them in, and how often it should crawl for updates.
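The discovery loop described above — visit a page, collect its links, queue the new ones — is essentially a breadth-first traversal. Here’s a rough sketch in Python, using a made-up, in-memory “web” instead of real HTTP requests:

```python
# Minimal sketch of a crawl frontier: breadth-first discovery over a toy,
# in-memory "web" (a dict mapping each page to the links it contains).
from collections import deque

toy_web = {
    "/": ["/products", "/about"],
    "/products": ["/products/shoes", "/"],
    "/about": ["/careers"],
    "/products/shoes": [],
    "/careers": [],
}

def crawl(start):
    frontier = deque([start])   # pages queued to crawl next
    seen = {start}              # avoid re-crawling the same URL
    order = []
    while frontier:
        page = frontier.popleft()
        order.append(page)      # a real crawler would fetch and index the page here
        for link in toy_web.get(page, []):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return order

print(crawl("/"))  # → ['/', '/products', '/about', '/products/shoes', '/careers']
```

A real crawler layers politeness rules, prioritization, and revisit schedules on top of this loop, but the frontier idea is the same.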
For example, if you have a new web page, or changes were made on an existing page, then the web crawler will take note and update the index. Or, if you have a new web page, you can ask search engines to crawl your site.
When the web crawler is on your page, it looks at the copy and meta tags, stores that information, and indexes it for Google to sort through for keywords.
Before this entire process is started, the web crawler will look at your robots.txt file to see which pages to crawl, which is why it’s so important for technical SEO.
Ultimately, when a web crawler crawls your page, it decides whether your page will show up on the search results page for a query. It’s important to note that some web crawlers might behave differently than others. For example, some might use different factors when deciding which web pages are most important to crawl.
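The robots.txt check mentioned above is easy to picture in code. Here’s a minimal sketch using Python’s standard-library robots.txt parser; the rules and URLs are illustrative, and a real crawler would fetch the file from the site instead of supplying it inline:

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules, not any site's real robots.txt file.
rules = [
    "User-agent: Baiduspider",
    "Allow: /shoes/",
    "Disallow: /checkout/",
    "User-agent: *",
    "Allow: /",
]

parser = RobotFileParser()
parser.parse(rules)  # a real crawler would read https://example.com/robots.txt

# Before fetching any URL, the crawler checks it against the rules.
print(parser.can_fetch("Baiduspider", "https://example.com/shoes/"))     # True
print(parser.can_fetch("Baiduspider", "https://example.com/checkout/"))  # False
```

Well-behaved crawlers run a check like this before every request, which is why a correct robots.txt file matters so much for technical SEO.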
Now that we’ve gone over how web crawlers work, we’ll discuss why they should crawl your website.
Why is website crawling important?
If you want your website to rank in search engines, it needs to be crawled and indexed. Without a web crawler, your website won’t be found, even if you search for an entire paragraph copied directly from it.
Simply put, your website can’t be found organically unless it’s been crawled at least once. If you want your site to reach the audience it’s meant for, and especially if you want to increase your organic traffic, search engines need to be able to find and crawl it.
If the technical aspect of this is confusing, I understand. That’s why HubSpot has a Website Optimization Course that puts technical topics into simple language and instructs you on how to implement your own solutions or discuss them with your web expert.
How and Why to Crawl Your Site
If your site has errors making it difficult to crawl, it could fall lower in SERP rankings. You work hard on your business and content, but – as mentioned above – no one will know how great your site is if they can’t find it online.
Luckily, there are crawling tools like Screaming Frog and Deepcrawl that can shed light on the health of your website. Performing a site audit with a crawling tool can help you find common errors and identify issues such as:
- Broken links: When links point to a page that no longer exists, they don’t just create a poor user experience; they can also harm your rankings in the SERPs.
- Duplicate content: Duplicate content across different URLs makes it difficult for Google (or other search engines) to choose which version is most relevant to a user’s search query. One option to remedy this is to combine them using a 301 redirect.
- Page titles: Duplicate, missing, too-long, or too-short title tags all affect how your page ranks.
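A 301 redirect like the one mentioned above is typically set in your server configuration. For example, on an Apache server it can go in an .htaccess file (the paths here are placeholders):

```
# Permanently point the duplicate URL at the canonical version
Redirect 301 /old-duplicate-page https://www.example.com/canonical-page
```

Search engines then consolidate ranking signals from both URLs onto the canonical one.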
You can’t fix problems on your site unless you know what they are. Using a web crawling tool takes the guesswork out of evaluating your site.
Types of Web Crawling Tools
There are plenty of tools on the market to choose from with various features, but they all fall into two categories: desktop programs you install locally and cloud-based platforms. The type of tool you use will depend on your team’s needs and budget. Generally, choosing a cloud-based option will allow for more collaboration, since the program won’t need to be stored on an individual’s device.
Once installed, you can set crawlers to run at a given interval and generate reports as needed.
Benefits of Using Web Crawling Tools
Having your site crawled properly is essential to SEO. In addition to diagnosing site errors, benefits of using a web crawling tool include:
1. Doesn’t Affect Site Performance
Site crawlers run in the background and won’t slow down your site when in use. They won’t interfere with your day-to-day tasks or have an effect on those browsing your site.
2. Built-in Reporting
Most crawlers have built-in reporting or analytics features and allow you to export these reports into an Excel spreadsheet or other formats. This feature saves time and allows you to quickly dig into the results of your audit.
3. Utilizes Automation
A great feature of web crawlers is that you can set a cadence to have them crawl your site. This allows you to regularly track site performance without having to manually pull a crawl report each time.
Performing regular site audits with a crawling tool is a great way to ensure your site is in good health and ranking as it should.
Expand Your Reach With Web Crawling
Web crawlers are responsible for searching and indexing content online for search engines. They work by sorting and filtering through web pages so search engines understand what every web page is about. Understanding web crawlers is just one part of effective technical SEO that can improve your website’s performance significantly.
This article was originally published July 15, 2021, and has been updated for comprehensiveness.
YouTube Ad Specs, Sizes, and Examples [2024 Update]
Introduction
With billions of users each month, YouTube is the world’s second largest search engine and top website for video content. This makes it a great place for advertising. To succeed, advertisers need to follow the correct YouTube ad specifications. These rules help your ad reach more viewers, increasing the chance of gaining new customers and boosting brand awareness.
Types of YouTube Ads
Video Ads
- Description: These play before, during, or after a YouTube video on computers or mobile devices.
- Types:
- In-stream ads: Can be skippable or non-skippable.
- Bumper ads: Non-skippable, short ads that play before, during, or after a video.
Display Ads
- Description: These appear in different spots on YouTube and usually use text or static images.
- Note: YouTube does not support display image ads directly on its app, but these can be targeted to YouTube.com through Google Display Network (GDN).
Companion Banners
- Description: Appears to the right of the YouTube player on desktop.
- Requirement: Must be purchased alongside In-stream ads, Bumper ads, or In-feed ads.
In-feed Ads
- Description: Resemble videos with images, headlines, and text. They link to a public or unlisted YouTube video.
Outstream Ads
- Description: Mobile-only video ads that play outside of YouTube, on websites and apps within the Google video partner network.
Masthead Ads
- Description: Premium, high-visibility banner ads displayed at the top of the YouTube homepage for both desktop and mobile users.
YouTube Ad Specs by Type
Skippable In-stream Video Ads
- Placement: Before, during, or after a YouTube video.
- Resolution:
- Horizontal: 1920 x 1080px
- Vertical: 1080 x 1920px
- Square: 1080 x 1080px
- Aspect Ratio:
- Horizontal: 16:9
- Vertical: 9:16
- Square: 1:1
- Length:
- Awareness: 15-20 seconds
- Consideration: 2-3 minutes
- Action: 15-20 seconds
Non-skippable In-stream Video Ads
- Description: Must be watched completely before the main video.
- Length: 15 seconds (or 20 seconds in certain markets).
- Resolution:
- Horizontal: 1920 x 1080px
- Vertical: 1080 x 1920px
- Square: 1080 x 1080px
- Aspect Ratio:
- Horizontal: 16:9
- Vertical: 9:16
- Square: 1:1
Bumper Ads
- Length: Maximum 6 seconds.
- File Format: MP4, QuickTime, AVI, ASF, Windows Media, or MPEG.
- Resolution:
- Horizontal: 640 x 360px
- Vertical: 480 x 360px
In-feed Ads
- Description: Show alongside YouTube content, like search results or the Home feed.
- Resolution:
- Horizontal: 1920 x 1080px
- Vertical: 1080 x 1920px
- Square: 1080 x 1080px
- Aspect Ratio:
- Horizontal: 16:9
- Square: 1:1
- Length:
- Awareness: 15-20 seconds
- Consideration: 2-3 minutes
- Headline/Description:
- Headline: Up to 2 lines, 40 characters per line
- Description: Up to 2 lines, 35 characters per line
Display Ads
- Description: Static images or animated media that appear on YouTube next to video suggestions, in search results, or on the homepage.
- Image Size: 300 x 60 pixels.
- File Type: GIF, JPG, PNG.
- File Size: Max 150KB.
- Max Animation Length: 30 seconds.
Outstream Ads
- Description: Mobile-only video ads that appear on websites and apps within the Google video partner network, not on YouTube itself.
- Logo Specs:
- Square: 1:1 (200 x 200px).
- File Type: JPG, GIF, PNG.
- Max Size: 200KB.
Masthead Ads
- Description: High-visibility ads at the top of the YouTube homepage.
- Resolution: 1920 x 1080 or higher.
- File Type: JPG or PNG (without transparency).
Conclusion
YouTube offers a variety of ad formats to reach audiences effectively in 2024. Whether you want to build brand awareness, drive conversions, or target specific demographics, YouTube provides a dynamic platform for your advertising needs. Always follow Google’s advertising policies and the technical ad specs to ensure your ads perform their best. Ready to start using YouTube ads? Contact us today to get started!
A deeper dive into data, personalization and Copilots
Salesforce launched a collection of new, generative AI-related products at Connections in Chicago this week. They included new Einstein Copilots for marketers and merchants and Einstein Personalization.
To better understand not only the potential impact of the new products but also the evolving Salesforce architecture, we sat down with Bobby Jania, CMO, Marketing Cloud.
Dig deeper: Salesforce piles on the Einstein Copilots
Salesforce’s evolving architecture
It’s hard to deny that Salesforce likes coming up with new names for platforms and products (what happened to Customer 360?) and this can sometimes make the observer wonder if something is brand new, or old but with a brand new name. In particular, what exactly is Einstein 1 and how is it related to Salesforce Data Cloud?
“Data Cloud is built on the Einstein 1 platform,” Jania explained. “The Einstein 1 platform is our entire Salesforce platform and that includes products like Sales Cloud, Service Cloud — that it includes the original idea of Salesforce not just being in the cloud, but being multi-tenancy.”
Data Cloud — not an acquisition, of course — was built natively on that platform. It was the first product built on Hyperforce, Salesforce’s new cloud infrastructure architecture. “Since Data Cloud was on what we now call the Einstein 1 platform from Day One, it has always natively connected to, and been able to read anything in Sales Cloud, Service Cloud [and so on]. On top of that, we can now bring in, not only structured but unstructured data.”
That’s a significant progression from the position, several years ago, when Salesforce had stitched together a platform around various acquisitions (ExactTarget, for example) that didn’t necessarily talk to each other.
“At times, what we would do is have a kind of behind-the-scenes flow where data from one product could be moved into another product,” said Jania, “but in many of those cases the data would then be in both, whereas now the data is in Data Cloud. Tableau will run natively off Data Cloud; Commerce Cloud, Service Cloud, Marketing Cloud — they’re all going to the same operational customer profile.” They’re not copying the data from Data Cloud, Jania confirmed.
Another thing to know is that it’s possible for Salesforce customers to import their own datasets into Data Cloud. “We wanted to create a federated data model,” said Jania. “If you’re using Snowflake, for example, we more or less virtually sit on your data lake. The value we add is that we will look at all your data and help you form these operational customer profiles.”
Let’s learn more about Einstein Copilot
“Copilot means that I have an assistant with me in the tool where I need to be working that contextually knows what I am trying to do and helps me at every step of the process,” Jania said.
For marketers, this might begin with a campaign brief developed with Copilot’s assistance, the identification of an audience based on the brief, and then the development of email or other content. “What’s really cool is the idea of Einstein Studio where our customers will create actions [for Copilot] that we hadn’t even thought about.”
Here’s a key insight (back to nomenclature). We reported on Copilot for marketers, Copilot for merchants, and Copilot for shoppers. It turns out, however, that there is just one Copilot, Einstein Copilot, and these are use cases. “There’s just one Copilot, we just add these for a little clarity; we’re going to talk about marketing use cases, about shoppers’ use cases. These are actions for the marketing use cases we built out of the box; you can build your own.”
It’s surely going to take a little time for marketers to learn to work easily with Copilot. “There’s always time for adoption,” Jania agreed. “What is directly connected with this is, this is my ninth Connections and this one has the most hands-on training that I’ve seen since 2014 — and a lot of that is getting people using Data Cloud, using these tools rather than just being given a demo.”
What’s new about Einstein Personalization
Salesforce Einstein has been around since 2016 and many of the use cases seem to have involved personalization in various forms. What’s new?
“Einstein Personalization is a real-time decision engine and it’s going to choose next-best-action, next-best-offer. What is new is that it’s a service now that runs natively on top of Data Cloud.” A lot of real-time decision engines need their own set of data that might actually be a subset of data. “Einstein Personalization is going to look holistically at a customer and recommend a next-best-action that could be natively surfaced in Service Cloud, Sales Cloud or Marketing Cloud.”
Finally, trust
One feature of the presentations at Connections was the reassurance that, although public LLMs like ChatGPT could be selected for application to customer data, none of that data would be retained by the LLMs. Is this just a matter of written agreements? No, not just that, said Jania.
“In the Einstein Trust Layer, all of the data, when it connects to an LLM, runs through our gateway. If there was a prompt that had personally identifiable information — a credit card number, an email address — at a minimum, all that is stripped out. The LLMs do not store the output; we store the output for auditing back in Salesforce. Any output that comes back through our gateway is logged in our system; it runs through a toxicity model; and only at the end do we put PII data back into the answer. There are real pieces beyond a handshake that this data is safe.”
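As an illustration of the general masking technique Jania describes — and not Salesforce’s actual Trust Layer implementation — stripping PII from a prompt and restoring it in the response can be sketched like this (the patterns and placeholder names are our own):

```python
# Hedged sketch: replace PII with placeholders before a prompt reaches an LLM,
# then put the original values back into the model's answer afterward.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def mask(prompt):
    """Replace each PII match with a placeholder; remember the originals."""
    restore = {}
    for label, pattern in PII_PATTERNS.items():
        for i, match in enumerate(pattern.findall(prompt)):
            token = f"<{label}_{i}>"
            prompt = prompt.replace(match, token)
            restore[token] = match
    return prompt, restore

def unmask(text, restore):
    """Put the original PII back into the model's answer."""
    for token, value in restore.items():
        text = text.replace(token, value)
    return text

masked, restore = mask("Email jane@example.com about her order.")
print(masked)  # → Email <EMAIL_0> about her order.
print(unmask(masked, restore))  # → Email jane@example.com about her order.
```

A production gateway would use far more robust PII detection, plus the logging and toxicity checks Jania mentions, but the mask-then-restore flow is the core idea.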