MARKETING
SEO Recap: PageRank – Moz
The author’s views are entirely his or her own (excluding the unlikely event of hypnosis) and may not always reflect the views of Moz.
Have you ever wondered how Moz employees learn internally? Well, here’s your chance to get a sneak peek into never seen before, internal webinar footage with Tom Capper! Learning is important at Moz, and the sharing of information amongst employees is crucial in making sure we stay true to our core values. Knowledge sharing allows us to stay transparent, work together more easily, find better ways of doing things, and create even better tools and experiences for our customers.
Tom started these sessions when everyone was working remotely in 2020. It allowed us to come together again in a special, collaborative way. So, today, we give to you all the gift of learning! In this exclusive webinar, Tom Capper takes us through the crucial topic of PageRank.
Video Transcription
This is actually a topic that I used to put poor, innocent, new recruits through, particularly if they came from a non-marketing background. Even though this is considered by a lot of people to be an advanced topic, I think it actually makes sense for people who want to learn about SEO to learn it first, because it’s foundational. And if you think about a lot of other technical SEO and link building topics from this perspective, they make a lot more sense and are simpler, and you kind of figure out the answers yourself rather than needing to read 10,000-word blog posts and patents and this kind of thing.
Anyway, hold that thought, because it’s 1998. I am 6 years old, and this is a glorious state-of-the-art video game, and internet browsing that I do in my computer club at school looks a bit like this. I actually didn’t use Yahoo!. I used Excite, which in hindsight was a mistake, but in my defense I was 6.
The one thing you’ll notice about this as a starting point for a journey on the internet, compared to something like Google or whatever you use today, maybe even like something that’s built into your browser these days, there is a lot of links on this page, and mostly there are links to pages with links on this page. It’s kind of like a taxonomy directory system. And this is important because if a lot of people browse the web using links, and links are primarily a navigational thing, then we can get some insights out of looking at links.
They’re a sort of proxy for popularity. If we assume that everyone starts their journey on the internet on Yahoo! in 1998, then the pages that are linked to from Yahoo! are going to get a lot of traffic. They are, by definition, popular, and the pages that those pages link to will also still get quite a lot and so on and so forth. And through this, we could build up some kind of picture of what websites are popular. And popularity is important because if you show popular websites to users in search results, then they will be more trustworthy and credible and likely to be good and this kind of thing.
This is massive oversimplification, bear with me, but this is kind of why Google won. Google recognized this fact, and they came up with an innovation called PageRank, which made their search engine better than other people’s search engines, and which every other search engine subsequently went on to imitate.
However, is anything I said just now relevant 23 years later? We definitely do not primarily navigate the web with links anymore. We use these things called search engines, which Google might know something about. But also we use newsfeeds, which are kind of dynamic and uncrawlable, and all sorts of other non-static, HTML link-based patterns. Links are probably not the majority even of how we navigate our way around the web, except maybe within websites. And Google has better data on popularity anyway. Like Google runs a mobile operating system. They run ISPs. They run a browser. They run YouTube. There are lots of ways for Google to figure out what is and isn’t popular without building some arcane link graph.
However, be that true or not, there still is a core methodology that underpins how Google works on a foundational level. In 1998, it was the case that PageRank was all of how Google worked really. It was just PageRank plus relevance. These days, there’s a lot of nuance and layers on top, and even PageRank itself probably isn’t even called that and probably has changed and been refined and tweaked around the edges. And it might be that PageRank is not used as a proxy for popularity anymore, but maybe as a proxy for trust or something like that and it has a slightly different role in the algorithm.
But the point is we still know purely through empirical evidence that changing how many and what pages link to a page has a big impact on organic performance. So we still know that something like this is happening. And the way that Google talks about how links work and their algorithms still reflects a broadly PageRank-based understanding as do developments in SEO directives and hreflang and rel and this kind of thing. It still all speaks to a PageRank-based ecosystem, if not a PageRank-only ecosystem.
Also, I’m calling it PageRank because that’s what Google calls it, but some other things you should be aware of that SEOs use, link equity I think is a good one to use because it kind of explains what you’re talking about in a useful way. Link flow, it’s not bad, but link flow is alluding to a different metaphor that you’ve probably seen before, where you think of links as being sent through big pipes of liquids that then pour in different amounts into different pages. It’s a different metaphor to the popularity one, and as a result it has some different implications if it’s overstretched, so use some caution. And then linking strength, I don’t really know what metaphor this is trying to do. It doesn’t seem as bad as link juice, at least fine, I guess.
More importantly, how does it work? And I don’t know if anyone here hates maths. If you do, I’m sorry, but there’s going to be maths.
So the initial sort of question is or the foundation of all this is imagine that, so A, in the red box here, that’s a web page to be clear in this diagram, imagine that the whole internet is represented in this diagram, that there’s only one web page, which means this is 1970 something, I guess, what is the probability that a random browser is on this page? We can probably say it’s one or something like that. If you want to have some other take on that, it kind of doesn’t matter because it’s all just going to be based on whatever number that is. From that though, we can sort of try to infer some other things.
So whatever probability you thought that was, and let’s say we thought that if there’s one page on the internet, everyone is on it, what’s the probability a random browser is on the one page, A, links to? So say that we’ve pictured the whole internet here. A is a page that links to another page which links nowhere. And we started by saying that everyone was on this page. Well, what’s the probability now, after a cycle, that everyone will be on this page? Well, we go with the assumption that there’s an 85% chance, and the 85% number comes from Google’s original 1998 white paper. There’s an 85% chance that they go onto this one page in their cycle, and a 15% chance that they do one of these non-browser-based activities. And the reason why we assume that there’s a chance on every cycle that people exit to do non-browser-based activities, it’s because otherwise we get some kind of infinite cycle later on. We don’t need to worry about that. But yeah, the point is that if you assume that people never leave their computers and that they just browse through links endlessly, then you end up assuming eventually that every page has infinite traffic, which is not the case.
That’s the starting point where we have this really simple internet, we have a page with a link on it, and a page without a link on it and that’s it. Something to bear in mind with these systems is, obviously, web pages don’t have just one link on them, and web pages with no links on them, like the one on the right, are virtually unheard of. This gets really complex really fast. If we try to make a diagram just of two pages on the Moz website, it would not fit on the screen. So we’re talking with really simplified versions here, but it doesn’t matter because the principles are extensible.
So what if the page on the left actually linked to two pages, not one? What is the probability now that we’re on one of those two pages? We’re taking that 85% chance that they move on at all without exiting, because the house caught fire, they went for a bike ride or whatever, and we’re now dividing that by two. So we’re saying 42.5% chance that they were on this page, 42.5% chance they were on this page, and then nothing else happens because there are no more links in the world. That’s fine.
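The arithmetic in this step can be sketched in a few lines of Python. This is only a toy version of the simplified one-step model described in the talk, not Google's implementation; the function name `share_per_link` is made up for illustration, and the starting strength of 1.0 is the "everyone is on page A" assumption from earlier.

```python
DAMPING = 0.85  # chance a surfer follows a link instead of leaving, as quoted in the talk

def share_per_link(source_strength, outlinks, damping=DAMPING):
    """Strength passed to each linked page in the simplified one-step model."""
    if outlinks <= 0:
        return 0.0  # a page with no links passes nothing on
    return source_strength * damping / outlinks

print(share_per_link(1.0, 1))  # page A links to one page
print(share_per_link(1.0, 2))  # page A links to two pages
```

With one link the target gets the full 85%; with two links each target gets 42.5%, matching the figures above.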
What about this page? So if this page now links to one more, how does this page’s strength relate to page A? So this one was 0.85/2, and this one is 0.85 times that number. So note that we are diluting as we go along because we’ve applied that 15% deterioration on every step. This is useful and interesting to us because we can imagine a model in which page A, on the left, is our homepage and the page on the right is some page we want to rank, and we’re diluting with every step that we have to jump to get there. And this is crawl depth, which is a metric that is exposed by Moz Pro and most other technical SEO tools. That’s why crawl depth is something that people are interested in. Part of it is discovery, which we won’t get into today, but part of it is also this dilution factor.
And then if this page actually linked to three, then again, each of these pages is only one-third as strong as when it only linked to one. So it’s being split up and diluted the further down we go.
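The dilution along a path can be written as a product of per-hop shares. The sketch below is a toy calculation under the same simplified model (each hop passes on 0.85 divided by the number of links on the page); `strength_along_path` is a hypothetical helper, not something from any SEO tool.

```python
from math import prod

DAMPING = 0.85  # per-step chance of following a link, as quoted in the talk

def strength_along_path(outlink_counts, damping=DAMPING):
    """Multiply the per-hop share (damping / number of links) along one path.

    outlink_counts[i] is how many links the i-th page on the path carries.
    """
    return prod(damping / count for count in outlink_counts)

one_hop = strength_along_path([2])         # A links to two pages
two_hops = strength_along_path([2, 1])     # one of those links on to a single page
split_three = strength_along_path([2, 3])  # or it links on to three pages instead

print(f"{one_hop:.3f} {two_hops:.3f} {split_three:.3f}")
```

Each extra hop multiplies by another 0.85, and each extra link on a page divides that page's contribution further, which is the "split up and diluted" effect described above.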
So that all got very complicated very quick on a very simple, fictional website. Don’t panic. The lessons we want to take away from this are quite simple, even though the math becomes very arcane very quickly.
So the first lesson we want to take is that each additional level of link depth dilutes value. So we talked about the reasons for that, but obviously it has implications for site structure. It also has implications in some other things, some other common technical SEO issues that I’ll cover in a bit.
So if I link to a page indirectly that is less effective than linking to a page directly, even in a world where every page only has one link on it, which is obviously an ideal scenario.
The other takeaway we can have is that more links means each link is less valuable. So with every additional link you add to your homepage, you’re reducing the effectiveness of the links that were already there. So this is very important because if you look on a lot of sites right now, you’ll find 600 link mega navs at the top of the page and the same at the bottom of the page and all this kind of thing. And that can be an okay choice. I’m not saying that’s always wrong, but it is a choice and it has dramatic implications.
Some of the biggest changes in SEO performance I’ve ever seen on websites came from cutting back the number of links on the homepage by a factor of 10. If you change a homepage so that it goes from linking to 600 pages to linking to the less than 100 that you actually want to rank, that will almost always have a massive difference, a massive impact, more so than external link building could ever dream of because you’re not going to get that 10 times difference through external link building, unless it’s a startup or something.
Some real-world scenarios. I want to talk about basically some things that SEO tools often flag, that we’re all familiar with talking about as SEO issues or optimizations or whatever, but often we don’t think about why and we definitely don’t think of them as being things that hark back quite so deep into Google’s history.
So a redirect is a link, the fictional idea of a page with one link on it is a redirect, because a redirect is just a page that links to exactly one other page. So in this scenario, the page on the left could have linked directly to the page on the top right, but because it didn’t, we’ve got this 0.85 squared here, which is 0.7225. The only thing you need to know about that is that it’s a smaller number than 0.85. Because we didn’t link directly, we went through this page here that redirected, which doesn’t feel like a link, but is a link in this ecosystem, we’ve just arbitrarily decided to dilute the page at the end of the cycle. And this is, obviously, particularly important when we think about chain redirects, which is another thing that’s often flagged by the SEO tools.
But when you look in an issue report in something like Moz Pro and it gives you a list of redirects as if they’re issues, that can be confusing because a redirect is something we’re also told is a good thing. Like if we have a URL that’s no longer in use, it should redirect. But the reason that issue is being flagged is we shouldn’t still be linking to the URL that redirects. We should be linking directly to the thing at the end of the chain. And this is why. It’s because of this arbitrary dilution that we’re inserting into our own website, which is basically just a dead weight loss. If you imagine that in reality, pages do tend to link back to each other, this will be a big complex web and cycle that is, and I think this is where the flow thing comes around because people can imagine a flow of buckets that drip round into each other but leak a little bit at every step, and then you get less and less water, unless there’s some external source. If you imagine these are looping back around, then inserting redirects is just dead weight loss. We’ve drilled a hole in the bottom of a bucket.
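To put numbers on the redirect cost, here is a minimal sketch that treats every redirect as one extra single-link hop, as the talk does. The 0.85 factor is the figure quoted earlier; the function name and the four-hop chain are illustrative assumptions.

```python
DAMPING = 0.85  # per-hop factor quoted in the talk

def strength_after_hops(hops, damping=DAMPING):
    """Strength reaching the final URL when every hop is a single-link page."""
    return damping ** hops

direct = strength_after_hops(1)        # link straight to the final URL
via_redirect = strength_after_hops(2)  # link to a URL that redirects once
via_chain = strength_after_hops(4)     # link followed by three chained redirects

print(round(direct, 4), round(via_redirect, 4), round(via_chain, 4))
```

The single redirect gives the 0.7225 mentioned above, and each further redirect in a chain multiplies by 0.85 again, which is why tools flag internal links that point at redirecting URLs.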
So, yeah, better is a direct link. Worse is a 302, although that’s a controversial subject, who knows. Google sometimes claim that they treat 302s as 301s these days. Let’s not get into that.
Canonicals, very similar, a canonical from a PageRank perspective. A canonical is actually a much later addition to search engines. But a canonical is basically equivalent to a 301 redirect. So if we have this badgers page, which has two versions, so you can access it by going to badgers?colour=brown. Or so imagine I have a website that sells live badgers for some reason in different colors, and then I might have these two different URL variants for my badger e-com page filtered to brown. And I’ve decided that this one without any parameters is the canonical version, literally and figuratively speaking. If the homepage links to it via this parameter page, which then has canonical tag pointing at the correct version, then I’ve arbitrarily weakened the correct version versus what I could have done, which would be the direct link through. Interestingly, if we do have this direct link through, note that this page now has no strength at all. It now has no inbound links, and also it probably wouldn’t get flagged as an error in the tool because the tool wouldn’t find it.
You’ll notice I put a tilde before the number zero. We’ll come to that.
PageRank sculpting is another thing that I think is interesting because people still try to do it even though it’s not worked for a really long time. So this is an imaginary scenario that is not imaginary at all. It’s really common, Moz probably has this exact scenario, where your homepage links to some pages you care about and also some pages you don’t really care about, certainly from an SEO perspective, such as your privacy policy. Kind of sucks because, in this extreme example here, having a privacy policy has just randomly halved the strength of a page you care about. No one wants that.
So what people used to do was they would use a link level nofollow. They use a link level nofollow, which . . . So the idea was, and it worked at the time, and by at the time, I mean like 2002 or something. But people still try this on new websites today. The idea was that effectively the link level nofollow removed this link, so it was as if your homepage only linked to one page. Great, everyone is a winner.
Side note I talked about before. So no page actually has zero PageRank. A page with no links in the PageRank model has the PageRank one over the number of pages on the internet. That’s the seeding probability that before everything starts going and cycles round and figures out what the stable equilibrium PageRank is, they assume that there’s an equal chance you’re on any page on the internet. One divided by the number of pages on the internet is a very small number, so we can think of it as zero.
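The "seed, then cycle until stable" idea can be sketched with the textbook power-iteration form of PageRank. This is an assumption-laden toy, not how Google computes anything today: the three-page `toy_web`, the fixed 50 iterations, and the convention of spreading a link-less page's strength evenly across all pages are all choices made here for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power iteration over a dict of page -> list of linked pages."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    # Seeding probability: an equal chance of being on any page.
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline, so no page ever reaches exactly zero.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Page with no links: spread its strength evenly (a common convention).
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank

toy_web = {
    "home": ["products", "privacy"],
    "products": ["home"],
    "privacy": ["home"],
}
ranks = pagerank(toy_web)
print({page: round(value, 3) for page, value in ranks.items()})
```

On a three-page web the baseline is large; on the real web, one divided by the number of pages is so small that treating it as roughly zero, as above, is reasonable.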
This was changed, our link level nofollow hack was changed, again a very, very long time ago, such that if you use a link level nofollow, and by the way, this is also true if you use robots.txt to do this, this second link will still be counted when we go here and we have this divided by two to say we are halving, there’s an equal chance that you go to either of these pages. This page still gets that reduction because it was one of two links, but this page at the bottom now has no strength at all because it was only linked through a nofollow. So if you do this now, it’s a worst of both worlds scenario. And you might say, “Oh, I don’t actually care whether my privacy policy has zero strength,” whatever. But you do care because your privacy policy probably links through the top nav to every other page on your website. So you’re still doing yourself a disservice.
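The difference between the old sculpting behavior and the current one comes down to whether a nofollowed link still counts in the denominator. The sketch below models both under the talk's simplified assumptions; `distribute` and its `nofollow_counts_in_split` flag are invented names for illustration only.

```python
def distribute(links, damping=0.85, nofollow_counts_in_split=True):
    """Split a page's strength across its links.

    links maps each target URL to True if the link is nofollowed.
    A nofollowed link never receives strength; the flag controls whether it
    still takes up a share of the split (current behavior) or is ignored
    entirely (the old sculpting behavior).
    """
    followed = [target for target, nofollow in links.items() if not nofollow]
    denominator = len(links) if nofollow_counts_in_split else len(followed)
    if denominator == 0:
        return {target: 0.0 for target in links}
    return {
        target: 0.0 if nofollow else damping / denominator
        for target, nofollow in links.items()
    }

homepage_links = {"/products": False, "/privacy": True}
old_behavior = distribute(homepage_links, nofollow_counts_in_split=False)
new_behavior = distribute(homepage_links, nofollow_counts_in_split=True)
print(old_behavior)
print(new_behavior)
```

Under the old behavior the products page gets the full 0.85; under the current one it still gets only 0.425 while the privacy policy gets nothing, which is the worst of both worlds outcome described above.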
Second side note, I said link level nofollow, meaning nofollow in the HTML is an attribute to a link. There is also page level nofollow, which I struggle to think of a single good use case for. Basically, a page level nofollow means we are going to treat every single link on this page as nofollow. So we’re just going to create a PageRank dead-end. This is a strange thing to do. Sometimes people use robots.txt, which basically does the same thing. If I block this page with robots.txt, that’s the same in terms of the PageRank consequences, except there are other good reasons to do that, like I might not want Google to ever see this, or I might want to prevent a massive waste of Google’s crawlers’ time so that they spend more time crawling the rest of my site or something like this. There are reasons to use robots.txt. Page level nofollow is we’re going to create that dead-end, but also we’re going to waste Google’s time crawling it anyway.
Some of the extreme scenarios I just talked about, particularly the one with the privacy policy, changed a lot for the better for everyone in 2004 with something called reasonable surfer, which you occasionally still hear people talking about now, but mostly implicitly. And it is probably actually an under-discussed or underheld in mind topic.
So these days, and by these days, I mean for the last 17 years, if one of these links was that massive call to action and another one of these links was in the footer, like a privacy policy link often is, then Google will apply some sense and say the chance people click on this one . . . Google was trying to figure out probabilities here, remember. So we’ll split this. This 0.9 and 0.1 still have to add up to 1, but we’ll split them in a more reasonable fashion. Yeah, they were doing that a long time ago. They’ve probably got very, very good at it by now.
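A weighted split like this can be sketched by replacing the equal division with click-likelihood weights. The 0.9 and 0.1 weights below are the illustrative figures from the talk; how Google actually estimates such weights is not public, and `weighted_split` is a hypothetical helper.

```python
def weighted_split(click_weights, damping=0.85):
    """Share a page's strength in proportion to how likely each link is to be clicked."""
    total = sum(click_weights.values())
    if total <= 0:
        raise ValueError("click weights must sum to a positive number")
    return {target: damping * weight / total for target, weight in click_weights.items()}

# Illustrative weights: a prominent call to action versus a footer link.
shares = weighted_split({"/products": 0.9, "/privacy": 0.1})
print({target: round(value, 3) for target, value in shares.items()})
```

Compared with the equal split, the page you care about keeps most of the strength (about 0.765 rather than 0.425), so the footer privacy link costs far less than the earlier example suggested.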
Noindex is an interesting one because, traditionally, you would think that has nothing to do with PageRank. So, yeah, a noindex tag just means this should never show up in search results, this page at the bottom, which is fine. There are some valid reasons to do that. Maybe you’re worried that it will show up for the wrong query that something else on your site is trying to show up for, or maybe it contains sensitive information or something like this. Okay, fine. However, when you put a noindex tag on something, Google eventually stops crawling it. Everyone sort of intuitively knew all the pieces of this puzzle, but Google only acknowledged that this behavior is what happens a couple of years ago.
So Google eventually stops crawling it, and when Google stops crawling it, it stops passing PageRank. So noindex follow, which used to be quite a good thing or we thought quite a good thing to do for a page like an HTML sitemap page or something like that, like an HTML sitemap page, clearly you don’t want to show up in search results because it’s kind of crap and a poor reflection on your site and not a good UX and this kind of thing. But it is a good way to pass equity through to a bunch of deep pages, or so we thought. It turns out probably not. It was equivalent to that worst case scenario, page level nofollow in the long run that we talked about earlier. And again, this is probably why noindex is flagged as an error in tools like Moz Pro, although often it’s not well explained or understood.
My pet theory on how links work is that, at this stage, they’re no longer a popularity proxy because there’s better ways of doing that. But they are a brand proxy for a frequently cited brand. Citation and link are often used synonymously in this industry, so that kind of makes sense. However, once you actually start ranking in the top 5 or 10, my experience is that links become less and less relevant the more and more competitive a position you’re in because Google has increasingly better data to figure out whether people want to click on you or not. This is some data from 2009, contrasting ranking correlations in positions 6 to 10, versus positions 1 to 5. Basically, both brand and link become less relevant, or the easily measured versions become less relevant, which again is kind of exploring that theory that the higher up you rank, the more bespoke and user signal-based it might become.
This is some older data, where I basically looked at to what extent you can use Domain Authority to predict rankings, which is this blue bar, to what extent you could use branded search volume to predict rankings, which is this green bar, and to what extent you could use a model containing them both to predict rankings, which is not really any better than just using branded search volume. This is obviously simplified and flawed data, but this is some evidence towards the hypothesis that links are used as a brand proxy.
MARKETING
YouTube Ad Specs, Sizes, and Examples [2024 Update]
Introduction
With billions of users each month, YouTube is the world’s second largest search engine and top website for video content. This makes it a great place for advertising. To succeed, advertisers need to follow the correct YouTube ad specifications. These rules help your ad reach more viewers, increasing the chance of gaining new customers and boosting brand awareness.
Types of YouTube Ads
Video Ads
- Description: These play before, during, or after a YouTube video on computers or mobile devices.
- Types:
- In-stream ads: Can be skippable or non-skippable.
- Bumper ads: Non-skippable, short ads that play before, during, or after a video.
Display Ads
- Description: These appear in different spots on YouTube and usually use text or static images.
- Note: YouTube does not support display image ads directly on its app, but these can be targeted to YouTube.com through Google Display Network (GDN).
Companion Banners
- Description: Appears to the right of the YouTube player on desktop.
- Requirement: Must be purchased alongside In-stream ads, Bumper ads, or In-feed ads.
In-feed Ads
- Description: Resemble videos with images, headlines, and text. They link to a public or unlisted YouTube video.
Outstream Ads
- Description: Mobile-only video ads that play outside of YouTube, on websites and apps within the Google video partner network.
Masthead Ads
- Description: Premium, high-visibility banner ads displayed at the top of the YouTube homepage for both desktop and mobile users.
YouTube Ad Specs by Type
Skippable In-stream Video Ads
- Placement: Before, during, or after a YouTube video.
- Resolution:
- Horizontal: 1920 x 1080px
- Vertical: 1080 x 1920px
- Square: 1080 x 1080px
- Aspect Ratio:
- Horizontal: 16:9
- Vertical: 9:16
- Square: 1:1
- Length:
- Awareness: 15-20 seconds
- Consideration: 2-3 minutes
- Action: 15-20 seconds
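A quick way to sanity-check a creative against the resolution and aspect ratio pairs above is to reduce its pixel dimensions to a ratio. This is a small illustrative helper, not part of any Google Ads tooling; the function name `aspect_ratio` is an assumption.

```python
from math import gcd

def aspect_ratio(width, height):
    """Reduce pixel dimensions to a simplified 'W:H' ratio string."""
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"

# Resolutions listed for skippable in-stream ads.
for width, height in [(1920, 1080), (1080, 1920), (1080, 1080)]:
    print(f"{width} x {height}px -> {aspect_ratio(width, height)}")
```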
Non-skippable In-stream Video Ads
- Description: Must be watched completely before the main video.
- Length: 15 seconds (or 20 seconds in certain markets).
- Resolution:
- Horizontal: 1920 x 1080px
- Vertical: 1080 x 1920px
- Square: 1080 x 1080px
- Aspect Ratio:
- Horizontal: 16:9
- Vertical: 9:16
- Square: 1:1
Bumper Ads
- Length: Maximum 6 seconds.
- File Format: MP4, Quicktime, AVI, ASF, Windows Media, or MPEG.
- Resolution:
- Horizontal: 640 x 360px
- Vertical: 480 x 360px
In-feed Ads
- Description: Show alongside YouTube content, like search results or the Home feed.
- Resolution:
- Horizontal: 1920 x 1080px
- Vertical: 1080 x 1920px
- Square: 1080 x 1080px
- Aspect Ratio:
- Horizontal: 16:9
- Square: 1:1
- Length:
- Awareness: 15-20 seconds
- Consideration: 2-3 minutes
- Headline/Description:
- Headline: Up to 2 lines, 40 characters per line
- Description: Up to 2 lines, 35 characters per line
Display Ads
- Description: Static images or animated media that appear on YouTube next to video suggestions, in search results, or on the homepage.
- Image Size: 300×60 pixels.
- File Type: GIF, JPG, PNG.
- File Size: Max 150KB.
- Max Animation Length: 30 seconds.
Outstream Ads
- Description: Mobile-only video ads that appear on websites and apps within the Google video partner network, not on YouTube itself.
- Logo Specs:
- Square: 1:1 (200 x 200px).
- File Type: JPG, GIF, PNG.
- Max Size: 200KB.
Masthead Ads
- Description: High-visibility ads at the top of the YouTube homepage.
- Resolution: 1920 x 1080 or higher.
- File Type: JPG or PNG (without transparency).
Conclusion
YouTube offers a variety of ad formats to reach audiences effectively in 2024. Whether you want to build brand awareness, drive conversions, or target specific demographics, YouTube provides a dynamic platform for your advertising needs. Always follow Google’s advertising policies and the technical ad specs to ensure your ads perform their best. Ready to start using YouTube ads? Contact us today to get started!
MARKETING
Why We Are Always ‘Clicking to Buy’, According to Psychologists
Amazon pillows.
MARKETING
A deeper dive into data, personalization and Copilots
Salesforce launched a collection of new, generative AI-related products at Connections in Chicago this week. They included new Einstein Copilots for marketers and merchants and Einstein Personalization.
To better understand, not only the potential impact of the new products, but the evolving Salesforce architecture, we sat down with Bobby Jania, CMO, Marketing Cloud.
Salesforce’s evolving architecture
It’s hard to deny that Salesforce likes coming up with new names for platforms and products (what happened to Customer 360?) and this can sometimes make the observer wonder if something is brand new, or old but with a brand new name. In particular, what exactly is Einstein 1 and how is it related to Salesforce Data Cloud?
“Data Cloud is built on the Einstein 1 platform,” Jania explained. “The Einstein 1 platform is our entire Salesforce platform and that includes products like Sales Cloud, Service Cloud — that it includes the original idea of Salesforce not just being in the cloud, but being multi-tenancy.”
Data Cloud — not an acquisition, of course — was built natively on that platform. It was the first product built on Hyperforce, Salesforce’s new cloud infrastructure architecture. “Since Data Cloud was on what we now call the Einstein 1 platform from Day One, it has always natively connected to, and been able to read anything in Sales Cloud, Service Cloud [and so on]. On top of that, we can now bring in, not only structured but unstructured data.”
That’s a significant progression from the position, several years ago, when Salesforce had stitched together a platform around various acquisitions (ExactTarget, for example) that didn’t necessarily talk to each other.
“At times, what we would do is have a kind of behind-the-scenes flow where data from one product could be moved into another product,” said Jania, “but in many of those cases the data would then be in both, whereas now the data is in Data Cloud. Tableau will run natively off Data Cloud; Commerce Cloud, Service Cloud, Marketing Cloud — they’re all going to the same operational customer profile.” They’re not copying the data from Data Cloud, Jania confirmed.
Another thing to know is it’s possible for Salesforce customers to import their own datasets into Data Cloud. “We wanted to create a federated data model,” said Jania. “If you’re using Snowflake, for example, we more or less virtually sit on your data lake. The value we add is that we will look at all your data and help you form these operational customer profiles.”
Let’s learn more about Einstein Copilot
“Copilot means that I have an assistant with me in the tool where I need to be working that contextually knows what I am trying to do and helps me at every step of the process,” Jania said.
For marketers, this might begin with a campaign brief developed with Copilot’s assistance, the identification of an audience based on the brief, and then the development of email or other content. “What’s really cool is the idea of Einstein Studio where our customers will create actions [for Copilot] that we hadn’t even thought about.”
Here’s a key insight (back to nomenclature). We reported on Copilot for marketers, Copilot for merchants, Copilot for shoppers. It turns out, however, that there is just one Copilot, Einstein Copilot, and these are use cases. “There’s just one Copilot, we just add these for a little clarity; we’re going to talk about marketing use cases, about shoppers’ use cases. These are actions for the marketing use cases we built out of the box; you can build your own.”
It’s surely going to take a little time for marketers to learn to work easily with Copilot. “There’s always time for adoption,” Jania agreed. “What is directly connected with this is, this is my ninth Connections and this one has the most hands-on training that I’ve seen since 2014 — and a lot of that is getting people using Data Cloud, using these tools rather than just being given a demo.”
What’s new about Einstein Personalization
Salesforce Einstein has been around since 2016 and many of the use cases seem to have involved personalization in various forms. What’s new?
“Einstein Personalization is a real-time decision engine and it’s going to choose next-best-action, next-best-offer. What is new is that it’s a service now that runs natively on top of Data Cloud.” A lot of real-time decision engines need their own set of data that might actually be a subset of data. “Einstein Personalization is going to look holistically at a customer and recommend a next-best-action that could be natively surfaced in Service Cloud, Sales Cloud or Marketing Cloud.”
Finally, trust
One feature of the presentations at Connections was the reassurance that, although public LLMs like ChatGPT could be selected for application to customer data, none of that data would be retained by the LLMs. Is this just a matter of written agreements? No, not just that, said Jania.
“In the Einstein Trust Layer, all of the data, when it connects to an LLM, runs through our gateway. If there was a prompt that had personally identifiable information (a credit card number, an email address), at a minimum, all that is stripped out. The LLMs do not store the output; we store the output for auditing back in Salesforce. Any output that comes back through our gateway is logged in our system; it runs through a toxicity model; and only at the end do we put PII data back into the answer. There are real pieces beyond a handshake that this data is safe.”
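The mask-then-restore flow Jania describes can be illustrated with a small sketch. This is not Salesforce's implementation: the email-only regex, the `<PII_n>` placeholder format, and the `fake_llm` stand-in are all assumptions made for illustration, and a real gateway would also handle other PII types, logging, and toxicity checks.

```python
import re

# Simplified email pattern, for illustration only.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def mask_pii(prompt):
    """Replace each email address with a placeholder; return masked text and the mapping."""
    replacements = {}

    def _swap(match):
        token = f"<PII_{len(replacements)}>"
        replacements[token] = match.group(0)
        return token

    return EMAIL_PATTERN.sub(_swap, prompt), replacements

def unmask_pii(text, replacements):
    """Put the original values back into the model's answer."""
    for token, original in replacements.items():
        text = text.replace(token, original)
    return text

def fake_llm(prompt):
    # Stand-in for an external model call; it simply echoes the prompt.
    return f"Draft reply for: {prompt}"

masked, mapping = mask_pii("Email jo@example.com about the renewal")
answer = unmask_pii(fake_llm(masked), mapping)
print(masked)
print(answer)
```

The external model only ever sees the placeholder, and the real value is restored after the response comes back, which mirrors the "only at the end do we put PII data back" step in the quote.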