SEARCHENGINES
Yandex Search Ranking Factors Leaked & Exposed
Yandex allegedly had a boatload of source code from across its technology leaked by a disgruntled employee, and part of that was the source code for Yandex Search, Russia’s largest search engine. As you can imagine, SEOs and others are diving in to see what they can learn from the source code.
I personally did not download the source code, so I did not go through it myself, but I wanted to share what people posted on Twitter about their investigations of it.
Here’s the alpha version of an explorer tool for the leaked #Yandex Search code.
It lets you browse through the ranking factors, view by tags, etc, and start to find connections.
Easy to add new features if there’s anything you want to see! https://t.co/AjbYnrDl9P pic.twitter.com/pQ4scOkP6w
— Rob Ousbey (@RobOusbey) January 28, 2023
I downloaded the code, analyzed it and there is a lot of useful information for Google SEO as well. pic.twitter.com/RWrgnnlpj6
— Alex Buraks (@alex_buraks) January 27, 2023
Theoretically, what is the difference between algorithms used in Google and in Yandex?
They are quite similar:
– there is RankBrain analogue – MatrixNet;
– they are using PageRank (almost the same as in Google);
– a lot of text algorithms are the same. pic.twitter.com/Djjl8Bmjwn
— Alex Buraks (@alex_buraks) January 27, 2023
According to Statcounter Yandex is close to Yahoo and Bing by market share: pic.twitter.com/5GKIvKIvAo
— Alex Buraks (@alex_buraks) January 27, 2023
Main insights after analysing this list:
#1 Age of links is a ranking factor. pic.twitter.com/U47uWvEq9w
— Alex Buraks (@alex_buraks) January 27, 2023
#3 Numbers in URLs is bad for rankings pic.twitter.com/ECgwGeGUfb
— Alex Buraks (@alex_buraks) January 27, 2023
#5 Hard pessimization equal PR=0 pic.twitter.com/RRbhuJyZr1
— Alex Buraks (@alex_buraks) January 27, 2023
#7 Fun fact – there is a separate ranking factor for uplifting Wikipedia pic.twitter.com/799F8KFpkE
— Alex Buraks (@alex_buraks) January 27, 2023
#9 Document age and last update both are ranking factors. pic.twitter.com/ay1GTMVEtJ
— Alex Buraks (@alex_buraks) January 27, 2023
Right now I checked ~40% of the list, there are a lot more (about text relevancy, behaivor factors, page rank, internal links,etc).
Will continue this thread after some time.
— Alex Buraks (@alex_buraks) January 27, 2023
The first thread got a lot of impressions (500k views for the moment, thanks for you retweets and likes!), so I decided to finalize. https://t.co/UQiQsnpWd2
— Alex Buraks (@alex_buraks) January 28, 2023
#2 Additionnaly: ranking factor for orphan pages.
You can easy find them via Screming Frog or other crawlers. pic.twitter.com/zIPwAelpD0
— Alex Buraks (@alex_buraks) January 28, 2023
#4 Number of search queries of your site/url is a ranking factor.
Obviously more = better. pic.twitter.com/xXQ6FMDghP
— Alex Buraks (@alex_buraks) January 28, 2023
#6 If your url whould be the last for search session (user will find what he needs) – it whould impact rankings.
There are strict factors for this and predictible factors as well. pic.twitter.com/Zx3sBZORCs
— Alex Buraks (@alex_buraks) January 28, 2023
#8 Special ranking factors for short videos (tiktok, shorts, reels) pic.twitter.com/oKPzL09MID
— Alex Buraks (@alex_buraks) January 28, 2023
#10 Keywords in URL is a ranking factors.
As we can see from the description – the optimal would be include up to 3 words from the search query. pic.twitter.com/Q1euKWSiST
— Alex Buraks (@alex_buraks) January 28, 2023
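To see how a URL measures up against a target query, a rough check can be sketched in Python. This is only an illustration of the idea in the tweet, not anything taken from the leaked code: the function name `query_words_in_url` and the simple ASCII tokenization are assumptions, and non-Latin queries would need a different tokenizer.

```python
import re
from urllib.parse import urlparse


def query_words_in_url(query, url):
    """Return the distinct query words that also appear as tokens in the URL path.

    Hypothetical helper: tokens are lowercase runs of ASCII letters and digits.
    """
    path_tokens = set(re.findall(r"[a-z0-9]+", urlparse(url).path.lower()))
    query_tokens = re.findall(r"[a-z0-9]+", query.lower())
    # dict.fromkeys drops duplicate query words while keeping first-seen order
    return [word for word in dict.fromkeys(query_tokens) if word in path_tokens]


matches = query_words_in_url(
    "best running shoes",
    "https://shop.example/best-running-shoes-2024",
)
print(len(matches), matches)
```

The count of matched words can then be compared against the "up to 3 words" guidance quoted above.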
#14 One more ranking factor for content quality – broken embedded video on the page.
Embed videos – good for rankings.
Broken embed videos – bad. pic.twitter.com/2SUys65PHp
— Alex Buraks (@alex_buraks) January 28, 2023
#16 If you backlinks anchors contain all words from the keywords – it’s good for SEO.
If it is in a one link – it’s more beneficial. Especially if the order of words is the same. pic.twitter.com/WrbESJ8Da5
— Alex Buraks (@alex_buraks) January 28, 2023
#18 The quality rank of texts on the domain is a ranking factor.
Pages with low quality content affect the entire domain. pic.twitter.com/MJUCTVB9CH
— Alex Buraks (@alex_buraks) January 28, 2023
#20 Funny, there is a random as a separate ranking factor.
When you don’t understant why some of page is on top – it could be just random (to test behaivor factors). pic.twitter.com/TGtzFrmBOV
— Alex Buraks (@alex_buraks) January 28, 2023
#22 Backlinks from the top 100 best websites by PageRank impacts on rankings.
That’s not news. pic.twitter.com/ikxldWLJqy
— Alex Buraks (@alex_buraks) January 28, 2023
Wow, I just found the list with initial weights of Yandex ranking factors.
Do you need one more thread? 😁
P.S. final weights calculated by AI (matrixnet), but initial values are useful as well. pic.twitter.com/WeroYQy7Yu
— Alex Buraks (@alex_buraks) January 28, 2023
That said, I’ve been digging into the codebase myself to find things of interest.
I’m doing this live, so I don’t know how long it will take between tweets.
— Mic King (@iPullRank) January 27, 2023
A lot of the code related to Yandex Search lives in the Kernel, ExtSearch, Search, and Robot archives, but again I won’t be able to be comprehensive here until I’ve looked through everything.
— Mic King (@iPullRank) January 27, 2023
Some really interesting things in the web_meta_factors_info/factors_gen.in file as it relates to content features and factors.
For instance, some things that we’d expect like a minimum expectation of the proximity of words in a title to the words in the query. pic.twitter.com/YRsrCpVsqU
— Mic King (@iPullRank) January 27, 2023
Interestingly, there are a lot of scrapers in here Google News, Shopping, YouTube and even other Yandex services.
— Mic King (@iPullRank) January 27, 2023
Hmm…this might be the structure of how Yandex stores documents in their version of a doc server.
Still looking for an idea of how they structure their inverted index. pic.twitter.com/1lwTbOirnx
— Mic King (@iPullRank) January 27, 2023
Here’s a protobuf of link factors. pic.twitter.com/1RM6o1xzRg
— Mic King (@iPullRank) January 27, 2023
In the “link prioritizer code” they talk about decreasing the priority of links with the same text from the same host. In other words, don’t count the links from duplicate content. pic.twitter.com/dQTUnScCUy
— Mic King (@iPullRank) January 27, 2023
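To make that idea concrete, here is a minimal Python sketch of lowering the priority of repeated links that share a source host and anchor text. It is not Yandex’s code: the `Link` type, the `prioritize` function and the halving penalty are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Link:
    source_url: str   # page the link was found on
    anchor_text: str  # visible text of the link


def prioritize(links, penalty=0.5):
    """Return (link, priority) pairs.

    Hypothetical scheme: the first link seen for a (host, anchor text) pair
    keeps priority 1.0, and every later repeat is multiplied by `penalty` again.
    """
    seen = Counter()
    scored = []
    for link in links:
        host = urlparse(link.source_url).netloc.lower()
        key = (host, link.anchor_text.strip().lower())
        scored.append((link, penalty ** seen[key]))
        seen[key] += 1
    return scored
```

With this scheme, two links from the same host with the same text get priorities 1.0 and 0.5, while the same text from a different host starts again at 1.0.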
How did y’all come up with that number of ranking factors?
I see 481 factors just related to “Rapid Clicks” pic.twitter.com/sw5A3ia3Bk
— Mic King (@iPullRank) January 28, 2023
Similar to the Googs, Yandex has multiple ranking models to choose from.
In this select_ranking_models.cpp file, they talk about having different models for different languages and locations. pic.twitter.com/m210tpOUDb
— Mic King (@iPullRank) January 28, 2023
I’m gonna go watch TV, but I obviously have to add this to my book so I’m gonna add more over the next couple days
— Mic King (@iPullRank) January 28, 2023
Been digging into how this robot archive is structured.
It looks like the Zora directory is where a lot of interesting things are happening. There’s a limits.pb.txt file that stores the requests per second rate for the host and the IP address for 204k hosts. pic.twitter.com/0oulKm58dx
— Mic King (@iPullRank) January 28, 2023
Here’s where the Document and Query factors are collected and scored.
Looks like it goes to storage after this tho. pic.twitter.com/qJAiLfSrsU
— Mic King (@iPullRank) January 29, 2023
Ok, real quick, top 5 most positively and negatively weighted ranking factors and their coefficients in the initial weighting in Yandex’s document relevance calculation. Negatives first
#1 FI_ADV: -0.2509284637
This factor determines that there is advertising on the site.
— Mic King (@iPullRank) January 29, 2023
#3 FI_QURL_STAT_POWER: -0.1943768768
Factor is the number of URL impressions for the request
— Mic King (@iPullRank) January 29, 2023
#5 FI_GEO_CITY_URL_REGION_COUNTRY: -0.168645758
Factor is the geographical coincidence of the document and the country that the user searched from.
Ok, now for the top 5 positively weighted factors.
— Mic King (@iPullRank) January 29, 2023
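As a rough illustration of what an initial weighting like this could mean, the Python sketch below sums weight times value for the three negative coefficients quoted above. The linear form, the function name `weighted_contribution` and the 0 to 1 value range are assumptions; as an earlier tweet notes, the final weights are calculated by MatrixNet, so this is not how Yandex produces its final scores.

```python
# Initial coefficients as quoted in the thread above (negative factors only).
INITIAL_WEIGHTS = {
    "FI_ADV": -0.2509284637,
    "FI_QURL_STAT_POWER": -0.1943768768,
    "FI_GEO_CITY_URL_REGION_COUNTRY": -0.168645758,
}


def weighted_contribution(factor_values, weights=INITIAL_WEIGHTS):
    """Sum weight * value over the factors that have a known weight.

    Assumption: factor values are already normalized to the 0..1 range.
    Factors without a listed weight are ignored.
    """
    return sum(
        weights[name] * value
        for name, value in factor_values.items()
        if name in weights
    )


# Hypothetical document: advertising detected, half-strength impression signal.
print(weighted_contribution({"FI_ADV": 1.0, "FI_QURL_STAT_POWER": 0.5}))
```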
Here is a starting point for link related factors. https://t.co/fwP8TxuOrM
— Christoph C. Cemper 🇺🇦 🧡 SEO (@cemper) January 30, 2023
Will this help you do SEO on Google? Probably not but hey, it is super interesting.
Ah, but once they find the optimal word count …
BOOM
— John Mueller is watching out for Google+ 🐀 (@JohnMu) January 29, 2023
Forum discussion at WebmasterWorld.
Daily Search Forum Recap: April 25, 2024
Here is a recap of what happened in the search forums today, through the eyes of the Search Engine Roundtable and other search forums on the web.
The Google March 2024 core update is still rolling out and the SEO chatter is super heated despite the tools calming. Google Ads API version 16.1 is now out. Google’s John Mueller says splitting and merging sites takes longer than normal site moves for Google to process. Google updated its favicon documentation. And a scathing report on how Google executive Prabhakar Raghavan killed Google Search.
Search Engine Roundtable Stories:
- Google March Core Update Still Rolling Out & Heated SEO Chatter Continues
  Over the past few days, while I was offline, the SEO chatter around the Google search ranking volatility continued to be super heated. The Google tracking tools seemed to calm down a bit, but the chatter is still very heated. This is all while the Google March 2024 core update is still rolling out 51 days later.
- Report: How Prabhakar Raghavan Killed Google Search
  Ed Zitron wrote a piece named The Man Who Killed Google Search. It goes through in detail how Prabhakar Raghavan, Google’s former head of ads, led a coup so that he could run Google Search, and how an email chain from 2019 began a cascade of events that would lead to him running it into the ground, he said.
- Google Favicon Documentation Adds Rel Attribute Value Definitions
  Google has updated its favicon documentation for Google Search to add definitions for each supported rel attribute value.
- Google Ads API Version 16.1 Now Available
  Google released version 16.1 of the Google Ads API yesterday. The update includes query assets for Demand Gen, more location service details, more support warnings, Target ROAS bid simulation and more.
- Google: Splitting & Merging Sites Takes Longer Than Normal Site Migrations
  Want to scare an SEO? Just tell them they need to manage a site migration. Want to make an SEO faint? Tell them they need to manage to split a site into two or more sites while merging content on those sites. John Mueller from Google said it takes Google longer to process site splits and merges than normal site migrations.
- Google Chefs In Dublin
  Here is a photo I found on Instagram of a bunch of chefs at the Google office in Dublin. I am not sure if this was for some event or if Googlers were doing some sort of cooking class but it was a photo that caught my eye.
Other Great Search Threads:
- Interested in AI assistants within YouTube? -> The new experimental “Ask AI” feature in YouTube is pretty cool. Just tap the button and ask any question about the video you’re watching. Note, AI can’t control the video player as of n, Glenn Gabe on X
- What skeleton do you have in your closet?, WebmasterWorld
- Googlebot will crawl from one location (often the US), and if you redirect it based on its location, Googlebot would only see (and index) that country version. It’s better to use something like a banner., John Mueller on X
- I don’t know your sites, but even if the content’s the same, they’re essentially different sites (especially with ccTLDs), so it would be normal for a migration to affect them differently (and this seems to be quite a way back in the meantime)., John Mueller on X
- Search engines recrawl URLs at different rates, sometimes it’s multiple times a day, sometimes it’s once every few months. The verified removal tool is fastest, the public removal tool takes a few days because it needs to verify the URL properly., John Mueller on X
- You are now a Google Search Engineer. How do you fix organic search?, Gareth Boyd on X
Other Great Search Stories:
PPC
- PPC for Retail: Biggest Trends, Challenges, & Strategies for Success, WordStream
- Unlocking Success with Performance Max Campaigns, Location3 Media
- Discovering and Diagnosing a Google AdSense Rendering Bug, Merj
- Google delays third-party cookie demise yet again, Digiday
- How to Find and Use Competitor Keywords, Ahrefs
- Q&A: Promoting your app or game with Apple Search Ads, Apple Developer
- Updates to Healthcare and Medicines Policy (May 2024), Google Advertising Policies Help
- Windows 11 Start menu ads are now rolling out to everyone, The Verge
Feedback:
Have feedback on this daily recap? Let me know on Twitter @rustybrick or @seroundtable, or on Threads, Mastodon and Bluesky. You can also follow us on Facebook and on Google News, and make sure to subscribe to the YouTube channel, Apple Podcasts, Spotify, Google Podcasts or just contact us the old-fashioned way.
Google Won’t Change The 301 Signals For Ranking & SEO
Gary Illyes from Google said on stage at the SERP conference last week that there is no way that Google would change how the 301 redirect signal works for SEO or search rankings. Gary added that it’s a very reliable signal.
Nikola Minkov quoted Gary Illyes as saying, “It is a very reliable signal, and there is no way we could change that signal,” when asked if a 301 redirect not working is a myth. Honestly, I am not sure of the context of this question, as it is not clear from the post on X, but here it is:
More from @methode:
– 301 redirect not working is a myth. “It is a very reliable signal, and there is no way we could change that signal”. #SERPConf2024 #SERPConf2024International
— Nikola Minkov (@n_minkov) April 19, 2024
We’ve covered 301 redirects here countless times – but I never saw a myth that Google does not use 301 redirects as a signal for canonicalization or for passing signals from an old URL to the redirected URL.
Forum discussion at X.
Note: This was pre-written and scheduled to be posted today; I am currently offline for Passover.
Google Again Says Ignore Link Spam Especially To 404 Pages
I am not sure how many times Google has said that you do not need to disavow spammy links, that you can ignore link spam attacks and that links pointing to pages that 404/410 are links that do not count – but John Mueller from Google said it again.
In a thread on X, John Mueller from Google wrote, “if the links are going to URLs that 404 on your site, they’re already dropped.” “They do nothing,” he added, “If there’s no indexable destination URL, there’s no link.”
John then added, “I’d generally ignore link-spam, and definitely ignore link-spam to 404s.”
Asked if it would hurt to disavow, after responding with the messages above, John wrote:
It will do absolutely nothing. I would take the time to rework a holistic & forward-looking strategy for the site overall instead of working on incremental tweaks (other tweaks might do something, but you probably need real change, not tweaks).
Earlier this year, tons of SEOs noticed spammy links to 404 error pages, and John said to ignore them. In 2021, Google said links to 404 pages do not count; Google also said that in 2012 and many other times.
Plus, outside of links to 404 pages, Google has said to ignore spammy links, time and time again – even the toxic links – ignore them. The messaging around this changed in 2016 when Penguin 4.0 was released and Google began devaluing links over demoting them.
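For anyone triaging a backlink export with this advice in mind, here is a small Python sketch that sets aside links whose target URL returns a 404 or 410. It does not model how Google processes links; the function name `links_that_count` and the shape of the input data are assumptions for illustration.

```python
DROPPED_STATUSES = {404, 410}


def links_that_count(backlinks, target_status):
    """Keep only backlinks whose target URL is not a 404/410.

    backlinks: iterable of (source_url, target_url) pairs.
    target_status: dict mapping target_url -> HTTP status code (hypothetical
    crawl data). Targets with no recorded status are kept for manual review.
    """
    return [
        (source, target)
        for source, target in backlinks
        if target_status.get(target) not in DROPPED_STATUSES
    ]


backlinks = [
    ("https://spam.example/a", "https://mysite.example/old-page"),
    ("https://blog.example/post", "https://mysite.example/guide"),
]
statuses = {
    "https://mysite.example/old-page": 404,
    "https://mysite.example/guide": 200,
}
print(links_that_count(backlinks, statuses))
```

Anything filtered out here points at a URL with no indexable destination, which is the case John describes as doing nothing.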
Here are those new posts in context:
I’d say add both. Lol
— Jeremy Rivera (@JeremyRiveraSEO) April 11, 2024
Sure. But also, save yourself the work completely :-).
— John 🧀 … 🧀 (@JohnMu) April 11, 2024
Re-reading your initial post – if the links are going to URLs that 404 on your site, they’re already dropped. They do nothing. If there’s no indexable destination URL, there’s no link. I’d generally ignore link-spam, and definitely ignore link-spam to 404s.
— John 🧀 … 🧀 (@JohnMu) April 11, 2024
… but still… is this a dumb idea?
— Rebekah Edwards (@rebekah_creates) April 11, 2024
It will do absolutely nothing. I would take the time to rework a holistic & forward-looking strategy for the site overall instead of working on incremental tweaks (other tweaks might do something, but you probably need real change, not tweaks).
— John 🧀 … 🧀 (@JohnMu) April 11, 2024
In general, Google says it ignores spammy links, so you should too. That is not new, but this post from John Mueller is:
I would just ignore them, Google ignores them too. Sometimes they’re just more visible in tools, but that doesn’t mean they’re a problem.
— John 🧀 … 🧀 (@JohnMu) April 18, 2024
John also wrote on Mastodon about a similar situation, “Google has 2 decades of practice of ignoring spammy links. There’s no need to do anything for those links.”
Forum discussion at X.
Note: This was pre-written and scheduled to be posted today; I am currently offline for Passover.