The recent leak of code fragments from Russian search engine giant Yandex has sent shockwaves through the SEO community worldwide. As reported by news agencies, almost 50GB of stolen data from the world’s fourth-largest search engine was leaked into the public domain. According to experts, the leak will offer some interesting insights into how search engines function and how the SEO market may be affected.

The leak took place on January 25. It included several files that were reportedly stolen in July last year from a company repository dating from February 2022. Incidentally, the repository’s creation coincides with the time when Russia invaded Ukraine. The source code files were reportedly leaked by a disgruntled former employee of the Russian tech giant.

The leaker posted a magnet link, describing its contents as ‘Yandex git sources’. The repositories reportedly contained the source code for all of Yandex’s major services. Following the development, the company issued a statement saying, “Yandex was not hacked. Our security service found code fragments from an internal repository in the public domain, but the content differs from the current version of the repository used in Yandex services.” The company also said that it was conducting an internal investigation into the reasons for the leak.

What is the Yandex leak about?

Even as the company continues to play down the leak, which was distributed via torrent, the files may hold a lot of useful information about how Yandex operates its search engine. The torrent has not yielded any data other than the source code of Yandex services. However, several SEO experts have taken to Twitter to share their findings.

On his website, Arseniy Shestakov, co-founder of the game development company Hack The Publisher, posted the list of major Yandex services whose source code was part of the leak. The list includes the search engine and indexing bots; Maps, similar to Google Maps and Street View; Alice, a voice assistant like Alexa; Taxi, an Uber-like service; Direct, similar to Google Ads; Mail, an email service; Disk, a file storage service; Travel, a tour service similar to Booking.com; Yandex360, a service akin to Google Workspace; Pay, a payment processing service like Stripe; and Metrika, a service similar to Google Analytics. The leak reportedly covers all of these services.

Based on the documentation available in the public domain, Yandex’s codebase was combined into a single large repository named Arcadia in 2013. The leaked codebase is essentially a subset of the projects that fall under Arcadia. Search-related components such as Kernel, Search, Robot and Library were found among the leaked files.

How can the Yandex leak impact the SEO industry?

Since the leak, the SEO industry has been sending mixed signals, with some hailing it and others calling it barely consequential. The leaked files list 1,922 search ranking factors, which, according to SEO expert Alex Buraks, is the most interesting part for the SEO community.

You probably heard about Yandex, it’s the 4th biggest search engine by market share worldwide. Yesterday proprietary source code of Yandex was leaked. The most interesting part for SEO community is: the list of all 1922 ranking factors used in the search algorithm [🧵THREAD] pic.twitter.com/6x82AAmbON — Alex Buraks (@alex_buraks) January 27, 2023

Igor Rudnyk, an SEO expert from Ukraine, took to his Twitter account to list his top takeaways for backlinks from the leaked Yandex files. They include the emphasis on growth in referring domains and backlinks; the significance of the number of links from main pages; the importance of anchor text and exact word order in URLs; the unfavourable treatment of long text without links; the importance of traffic from Wikipedia; and the role of local backlinks in country-level SERPs.

#5 Dirty hacks It's funny, that only 2 sites so important that have separate factors) I'm sure that the first one you would predict. That's right, it's wikipedia. And the second one is livejournal) pic.twitter.com/po5qKx9AaS — Igor Rudnyk🇺🇦 (@IRudnyk) January 29, 2023

Yandex vs Google

Yandex and Google are similar in principle, as they rely on similar algorithms. According to Buraks, Yandex uses PageRank in the same way as Google does, along with many similar text algorithms. Yandex was built as an analogue to Google, and SEO specialists in Russia have applied similar white-hat SEO techniques to both. While there are many technical differences, the overall approach and the major ranking factors appear to be similar, according to Buraks. There seems to be a 70 percent match between the search results on Google and Yandex. In terms of market share, however, Yandex is closer to Yahoo and Bing.
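
For readers unfamiliar with PageRank, the idea can be illustrated with a minimal sketch. This is the textbook power-iteration formulation, not code taken from the Yandex leak or from Google; the function name, the damping value of 0.85, and the toy link graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration (illustrative, not Yandex's code).

    `links` maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on evenly through its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page site used only to exercise the function.
toy_links = {
    "home": ["blog", "about"],
    "blog": ["home"],
    "about": ["home"],
    "orphan": ["home"],
}
ranks = pagerank(toy_links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.4f}")
```

In this toy graph the page that attracts the most inbound links ends up with the highest score, while a page nobody links to keeps only the baseline share. That is the intuition behind treating backlinks as a ranking signal.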

Yandex was founded by Arkady Volozh, Arkady Borkovsky, and Ilya Segalovich in 1997. Apart from being a search engine, it offers several other internet-related products and services.

The leak, from a Russian company that is as big as Google, Amazon or Netflix, comes at a time when Russia is facing an unprecedented rise in cyber attacks. In a recent survey released by Netherlands-based VPN services company Surfshark, Russia was found to be the nation with the most data breaches in the world in 2022.