7 Things We Learned From the Google Search Document Leak (2024)

The Google Search document leak is big news in the world of SEO. In this post, we’re cutting through the hype to get to the bottom of what it means at ground level for businesses that rely on the web for sales, and sharing what we believe are the top pieces of actionable advice to be gleaned from the leak.

In case you missed it, a large set of internal Google Search documents leaked recently, and SEOs have been having a field day trying to make sense of it all.

There’s been a massive amount of hype around it, and much of the feedback from the many industry insiders who’ve been unravelling the contents of the leaked Google API documents has been deeply technical.

We have of course taken a deep dive into the fallout to see what there was to learn from it all. Now that we’re clear on the most important aspects, we’re ready to sum up for you, in plain English, the essential parts you need to know and, more importantly, how they might affect your business in the online world.

What is the Google Search document leak?

On March 13th 2024, more than 2,500 documents, which appeared to emanate from Google’s internal Content API Warehouse, were released on developer code sharing platform GitHub by an automated bot called yoshi-code-bot.

The documents made their way into the hands of SEO consultant Erfan Azimi. He shared the findings with Rand Fishkin, co-founder of audience intelligence tool SparkToro and founder and former CEO of the globally renowned SEO tool Moz. As Fishkin had not been at the front line of search for six years, he collaborated with Mike King, iPullRank CEO. Both reviewed and analysed the leaked information and shared their thoughts on their respective blogs.

What did Fishkin and King discover, and why does it matter?

In a nutshell, information has been uncovered about how Google Search has used, or is using, clicks, links, content, entities, Chrome data and other elements to rank content.

This is important, as some of this information may form part of Google’s closely guarded search ranking algorithm, which it has always been intensely secretive about.

How did Google react to the leak?

At first, Google stated that many assumptions were being published in response to the leaked documents. They said they were out of context and based on incomplete information.

They wouldn’t comment on specific elements of the documents either, declining to confirm which were accurate, which were invalid, which are currently being used, and how they’re being used. There wasn’t even any indication as to whether the documents were authentic.

On 29th May, however, Google did confirm that the collection of documents is authentic. But that did not ease the widespread disgruntlement in the SEO community that had burgeoned in the 48 hours following the leak’s announcement.

The trouble was that the Google API documents showed clear details about ranking signals that Google had historically stated they did not use. And that’s why so many people felt they’d been lied to and misled by Google.

Why should we care?

The way Google decides how to rank a website has a huge impact on any business with a reliance on the web to drive sales. So, as you can imagine, if they’ve been saying one thing, but have been doing another, that’s not very helpful to anyone involved in managing SEO campaigns. And that’s putting it politely.

But not everyone in the SEO world feels that way. Some support Google. They could never bring themselves to believe that they’d be wronged by such a ‘trustworthy organisation’.

Between those two camps, the disgruntled and the believers, sits the rational ‘prove it for yourself’ camp: those of the opinion that Google sometimes tells the truth, but that it’s always best to do your own due diligence and test the theory.

And that’s pretty much where we’re sitting at this moment in time. We are ready to examine the theories, and see how we can draw tangible, actionable conclusions from them that will benefit our clients.

So, what next?

The Google API documents referred to more than 14,000 possible SEO ranking factors. Obviously we’re not going to unpick all of those in this post. We’ll spare you that.

But what we can do is share some of the SEO ranking factors that have come out of this revelation as being just as important as, or more important than, previously realised. And then we can draw some actionable conclusions.

So, here we go…

1. Links ARE important – but there’s more

Contrary to what we may have been led to believe, backlinks remain an important ranking signal. But there’s more… and it’s all about the right type of links.

The Google Search document leak has revealed that Google favours links from established sites with high domain authority, using non-spammy anchor text, and on popular, high-traffic pages.

Basically, links from a regularly updated news site that attracts many visitors each month are more beneficial than links from a site that hasn’t been updated in years and gets few visitors, even if the latter has a high domain authority.

New links basically trump old links. And one way to attract new links is by investing in digital PR, pulling in fresh links from regularly updated news sources.

Anchor text is also important. The clickable text of a link provides context about the linked page. Descriptive, relevant anchor text can enhance the relevance signals of a page and help it rank better for certain keywords.

However, excessive use of exact match keywords can result in penalties. Anchor text should therefore be naturally integrated within the content so that the reading experience remains seamless.
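One way to keep an eye on this is to measure how much of a page’s anchor text is an exact match for its target keyword. The sketch below is only an illustration: the function name, the sample anchors and the idea of reviewing a high share are our own assumptions, and the leak does not publish any threshold.

```python
from collections import Counter


def exact_match_share(anchors: list[str], target_keyword: str) -> float:
    """Return the fraction of anchors that are exactly the target keyword."""
    if not anchors:
        return 0.0
    target = target_keyword.strip().lower()
    normalised = [a.strip().lower() for a in anchors]
    return Counter(normalised)[target] / len(normalised)


# Hypothetical anchor texts pointing at a single page.
anchors = [
    "London SEO agency",
    "Figment",
    "this guide to the Google leak",
    "london seo agency",
    "read more",
]

share = exact_match_share(anchors, "London SEO agency")
print(f"Exact-match share: {share:.0%}")  # prints "Exact-match share: 40%"
```

A backlink profile where one exact-match phrase dominates is worth a closer look, whereas a mix of brand names, descriptive phrases and generic anchors tends to read more naturally.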

2. Google ‘site authority’ score does exist

The terms ‘domain authority’ and ‘domain rating’ have traditionally been associated with third-party SEO tools such as Moz, Semrush and Ahrefs, which use scores like these to show the growth in authority or strength of a domain, mostly based on links.

Google has long had us believe that they don’t measure domain authority. But now we are led to question whether that’s the case.

Following the Google API document leak, it has come to light that Google does indeed have a feature named ‘siteAuthority’ which looks very much like an equivalent version of domain authority or domain rating.

Unfortunately, it’s not clear how Google calculates its site authority, or the extent to which it uses it within its ranking systems, if at all. But there are clues that it’s playing some sort of role in its quality signals.

So it looks like we should take this to mean that authority matters. This returns us to the importance of backlink quality, and the value of fresh links coming from high quality news sites.

3. Content authors matter

Whilst E-E-A-T (Experience, Expertise, Authoritativeness and Trustworthiness) isn’t specifically mentioned in the API leak and has never been confirmed as an SEO ranking factor, we all know its value as a quality signal, and it’s obviously important for creating user trust.

However, something we did learn was that Google tracks authors (known as ‘entities’) across the web and within a website.

There’s also a way for them to tell if an author is the same person across different websites, and to cross-reference whether the entity on a page is actually the content’s author.

Now, we know that an important aspect of E-E-A-T is article authorship. So what we’ve learnt from this aspect of the leak is the importance of making sure a site’s blog posts carry an author bio and author archive, preferably connected to a social media account or other relevant bio.

It’s also valuable to carry the author’s name into published news posts as the ‘expert quote’ to maintain consistency outside of the website.
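A common way to make authorship explicit to search engines is schema.org structured data. To be clear, the leak does not mention structured data in this context, so treat the following as a minimal sketch rather than a confirmed ranking lever. The helper name, the author and every URL below are placeholders of our own.

```python
import json


def article_jsonld(headline: str, author_name: str,
                   author_url: str, profiles: list[str]) -> str:
    """Build schema.org Article markup that links a post to its author."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {
            "@type": "Person",
            "name": author_name,
            "url": author_url,    # the author's archive or bio page on the site
            "sameAs": profiles,   # the same person's profiles elsewhere
        },
    }
    return json.dumps(data, indent=2)


# Placeholder values for illustration only.
markup = article_jsonld(
    headline="7 Things We Learned From the Google Search Document Leak",
    author_name="Jane Example",
    author_url="https://example.com/authors/jane-example",
    profiles=["https://www.linkedin.com/in/jane-example"],
)
print(markup)
```

The output would sit inside a script tag of type application/ld+json on the post. Using the same name and the same profile links everywhere the author appears keeps the signal consistent.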

4. Content dates are significant

The Google Search document leak has prompted a renewed look at the importance of paying attention to dates on published content.

The leak has shown that Google places significant value on fresh, up to date content.

Whilst there was no mention of any date-related demotions, keeping the dates on published content consistent in every place Google might read them (the URL, the title and the content itself) can be considered best practice.

This is easier to explain by way of an example:

You publish an article on your website, say for example, ‘2022 Fashion Trends to Watch’. Its URL contains the year 2022. But a few months later, you update the article and change the year in its title to 2023. Then, 12 months later, you make some more amends, and add ‘last updated 2024’ into the content.

So now you have an article with a 2022 URL, a 2023 title, and 2024 in the content. Together, this mishmash of dates could harm the ability of the article to rank, because it is confusing Google as to the date your content was published.

Really, the best advice is to avoid including dates in article URLs at all, and ensure when refreshing content, all the dates are changed consistently.
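A check like this is easy to automate when auditing a site. The sketch below pulls four-digit years out of a page’s URL, title and body and flags the page if they disagree. The function names and the example URL are our own, and the check is deliberately crude: body copy may mention other years for perfectly good reasons, so treat a flag as a prompt to look, not a verdict.

```python
import re

# Four-digit years from 1900 to 2099 that are not part of a longer number.
YEAR = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")


def years_in(text: str) -> set[str]:
    """Return every distinct year mentioned in the text."""
    return set(YEAR.findall(text))


def date_mismatch(url: str, title: str, body: str) -> dict[str, set[str]]:
    """Return the years found per field if they disagree, else an empty dict."""
    found = {"url": years_in(url), "title": years_in(title), "body": years_in(body)}
    distinct = set().union(*found.values())
    return found if len(distinct) > 1 else {}


# The fashion article from the example above, with a placeholder domain.
report = date_mismatch(
    url="https://example.com/blog/2022-fashion-trends-to-watch",
    title="2023 Fashion Trends to Watch",
    body="Last updated 2024. Here are the trends we expect to see...",
)
print(report)
```

For the example article, the report shows 2022 in the URL, 2023 in the title and 2024 in the body. A page with no year in its URL and matching years elsewhere returns an empty result.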

5. User experience is top of the list

We’ve always known that user experience is important. After all, why would we want our website users to have anything other than a positive experience?

The leaked data shows that user interaction, especially click data, plays an important role in ranking. The ‘NavBoost’ system uses clickstream data to rank pages based on user behaviour, rewarding sites with a higher level of engagement.

The API documents specifically mention features such as ‘goodClicks’, ‘badClicks’, and ‘lastLongestClicks’. This is basically showing that Google is tracking how users make their way around a site, how long they spend on certain pages, where they navigate to next, etc.

With this in mind, optimising for NavBoost must also involve enhancing user satisfaction by adding content that quickly meets user intent and maintains interest by providing a unique take on things that stands out from every other page.

A strong introduction that keeps the user from clicking back to the Search Engine Results Pages (SERPs) in just a few seconds is imperative to show Google that your content is useful. Strong calls to action, a visually appealing design and plenty of interactivity will all help the cause, as will subheadings that introduce the answers to common user queries.

It also means creating enticing meta descriptions and titles that improve the click through rate from the SERPs. Provided the page delivers on what they promise, this will also help to reduce bounce rate and increase dwell time, letting Google know that the content is relevant and valuable.

Finally, we have for a long time known the importance of a well-designed website for user experience. But there is a clear case for making this an ongoing concern.

In other words, instead of designing and launching a website and leaving it as it is, make an effort to keep on improving it. Do this based on user journey feedback and analysis over time, and by keeping things aligned with the latest Google updates.

6. Chrome data is being used to inform rankings

The leaked Google documents reveal that data collected from the Chrome browser has a role to play in search rankings. Despite Google having previously said on more than one occasion that it does not use Chrome data for rankings, it is clear now that it is actually doing so in order to refine and improve its algorithms.

The documents reveal that Chrome tracks user interactions such as clicks, time spent on a page, and browsing patterns. This data helps Google understand user preferences and behaviour, which informs its decisions on ranking pages based on what users are actually doing in the real world.

This information demonstrates the importance of focusing on enhancing user experience by reducing page load times, improving navigation, and upping the ante with content engagement. As we’ve already explored, engaging content is the name of the game.

7. Fresh content is key

Another key takeaway from the Google Search document leak was that content that is not regularly updated has the lowest storage priority for Google, and will be unlikely to appear in the search results for fresh queries.

It’s therefore vital to update content regularly. Enhancing with fresh information and unique opinions and adding new images and videos is the name of the game.

The leak has also uncovered that Google maintains a record of every version of a web page, in effect creating an internal ‘web archive’. However, only the past 20 versions of a page are used. So, by updating a page and allowing it to be crawled, it is possible to push out older versions.
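To picture how newer versions push out older ones, think of a fixed-length queue. The snippet below is purely an illustration of that behaviour using the figure of 20 quoted above. It is not a description of how Google actually stores page versions.

```python
from collections import deque

MAX_VERSIONS = 20  # the figure quoted above

# A deque with maxlen drops its oldest item once it is full.
history = deque(maxlen=MAX_VERSIONS)

for version in range(1, 26):  # simulate 25 crawled versions of one page
    history.append(f"v{version}")

print(history[0], history[-1], len(history))  # prints "v6 v25 20"
```

After 25 crawls, versions 1 to 5 have dropped out of the window, which is the effect described above.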

It has also become clear that any pages on a website that aren’t topically relevant should be removed or blocked, as should poorly performing pages. Look to cull pages that have low user metrics and have failed to attract backlinks.

Site-wide scores are continuously referred to across the leaked documents, so it is just as effective to delete the weakest pages as it is to optimise new pages.
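A content audit along these lines can start from a simple export of analytics and backlink data. The sketch below lists pages that have no backlinks and also fall short on traffic or engagement. The field names and thresholds are hypothetical and should be set per site; nothing in the leak specifies them.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    monthly_visits: int
    avg_engagement_seconds: float
    backlinks: int


def cull_candidates(pages: list[Page], min_visits: int = 50,
                    min_engagement: float = 10.0) -> list[str]:
    """Return URLs with no backlinks and weak traffic or engagement."""
    return [
        p.url
        for p in pages
        if p.backlinks == 0
        and (p.monthly_visits < min_visits
             or p.avg_engagement_seconds < min_engagement)
    ]


# Hypothetical audit data.
pages = [
    Page("/services/seo", 900, 75.0, 12),
    Page("/blog/office-party-2019", 8, 6.5, 0),
    Page("/blog/google-leak", 400, 120.0, 0),
    Page("/tag/misc", 120, 4.0, 0),
]

print(cull_candidates(pages))
# prints ['/blog/office-party-2019', '/tag/misc']
```

Anything the function returns is a candidate for review rather than automatic deletion. A page may be worth improving, merging or redirecting instead.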

The Google Search document leak – what next for SEO?

The Google Search document leak whipped up something of a frenzy in the SEO world.

But whilst there have been a number of revelations in terms of contradictions between what Google has been telling us, and what is actually the case around its SEO ranking factors, much of the actionable advice really is what we’ve already been doing.

This is basically because so much of it is to do with creating the ultimate user experience, with content that engages the audience and addresses their queries, a clear navigation that guides people seamlessly to where they need to be, and a general, all-round positive, interactive experience.

The good news is that creating a positive user experience is what we’ve always been striving for anyway.

Figment is a London SEO agency with a proven track record of getting businesses found in Search. To discuss how we can help you improve your online visibility, you are welcome to get in touch.


Article information

Author: Manual Maggio
