Googlebot crawls and indexes the first 15MB of HTML content

29.06.2022 03:52 AM

An update to Google's help document for Googlebot confirms that the crawler fetches only the first 15MB of a webpage's HTML, and nothing after that cutoff is included in ranking calculations.

Google specifies in the help document:

“Any resources referenced in HTML such as images, videos, CSS and JavaScript are fetched separately. After the first 15MB of the file, Googlebot stops crawling and considers only the first 15MB of the file for indexing. File size limit applies to uncompressed data.”

This left some in the SEO community wondering whether Googlebot will completely ignore text that appears below images once an HTML file reaches the cutoff.

"It's specific to the HTML file itself, as written," John Mueller, a Google Search Advocate, explained via Twitter.

"Resources/embedded content pulled using IMG tags are not part of the HTML file."

What does this mean for SEO?

To ensure it is taken into account by Googlebot, important content should be placed near the top of web pages.

This means that the code should be structured so that SEO-relevant information falls within the first 15MB of a supported HTML or text-based file.
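As a rough illustration of the point above, the following sketch checks whether a given snippet of content ends before a byte cutoff in an HTML payload. The function name `falls_within_limit` and the constant are hypothetical, reading "15MB" as 15 * 1024 * 1024 bytes is an assumption, and Googlebot's exact truncation behavior is not documented at this level of detail.

```python
# Assumption: "15MB" is interpreted as 15 * 1024 * 1024 bytes.
# Google's documentation does not state the exact byte count.
ASSUMED_LIMIT_BYTES = 15 * 1024 * 1024


def falls_within_limit(html: bytes, marker: bytes,
                       limit: int = ASSUMED_LIMIT_BYTES) -> bool:
    """Return True if the first occurrence of `marker` ends before `limit`.

    `html` is the uncompressed HTML payload and `marker` is a byte string
    that identifies the content you care about (a heading, a key phrase).
    """
    start = html.find(marker)
    if start == -1:
        # The marker does not appear in the HTML at all.
        return False
    return start + len(marker) <= limit


if __name__ == "__main__":
    page = (b"<html><body><h1>Key topic</h1>"
            + b" " * 100
            + b"<p>footer note</p></body></html>")
    # A tiny limit is used here only to make the cutoff visible.
    print(falls_within_limit(page, b"Key topic", limit=50))
    print(falls_within_limit(page, b"footer note", limit=50))
```

The small `limit` in the demo only makes the behavior easy to see; for a real page the default would be used.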

It also means that images and videos should be compressed and referenced as separate files rather than encoded directly in the HTML, whenever possible.
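To get a feel for how much of a page's HTML is taken up by directly encoded media, a sketch like the one below can total the bytes used by inline `data:` URIs. The regular expression is a simplified assumption about how such URIs appear in markup, and the function name `inline_data_uri_bytes` is hypothetical; this is not an HTML parser.

```python
import re

# Simplified pattern (an assumption, not a full parser): a data URI is
# "data:", a media type, optional ";param" parts, a comma, then payload
# characters up to a quote, whitespace, ")" or ">".
DATA_URI_PATTERN = re.compile(
    r"""data:[\w.+-]+/[\w.+-]+(?:;[\w=.+-]+)*,[^"'\s)>]*"""
)


def inline_data_uri_bytes(html: str) -> int:
    """Return the total number of bytes occupied by inline data URIs."""
    return sum(
        len(match.group(0).encode("utf-8"))
        for match in DATA_URI_PATTERN.finditer(html)
    )


if __name__ == "__main__":
    sample = '<img src="data:image/png;base64,AAAA"><img src="photo.jpg">'
    # Only the first image is embedded; the second is fetched separately.
    print(inline_data_uri_bytes(sample))
```

An image referenced by URL, like `photo.jpg` in the sample, adds only its tag to the HTML size, while an embedded one adds its entire encoded payload.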

Because current SEO best practices recommend keeping HTML pages to 100KB or less, many sites will not be affected by this change. Page size can be checked using a variety of tools, including Google PageSpeed Insights.
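Page size can also be checked with a short script. The sketch below reports how an uncompressed HTML payload compares to the limit. The names `html_size_report` and `fetch_html` are hypothetical, and reading "15MB" as 15 * 1024 * 1024 bytes is an assumption, since Google's wording does not give an exact byte count.

```python
import urllib.request

# Assumption: "15MB" is interpreted as 15 * 1024 * 1024 bytes.
GOOGLEBOT_HTML_LIMIT_BYTES = 15 * 1024 * 1024


def html_size_report(html: bytes,
                     limit: int = GOOGLEBOT_HTML_LIMIT_BYTES) -> dict:
    """Summarize how an uncompressed HTML payload compares to `limit`."""
    size = len(html)
    return {
        "size_bytes": size,
        "limit_bytes": limit,
        "within_limit": size <= limit,
        "percent_of_limit": round(100 * size / limit, 2),
    }


def fetch_html(url: str, timeout: float = 10.0) -> bytes:
    """Fetch the raw response body for `url`.

    urllib asks for an identity (uncompressed) encoding by default, so the
    returned bytes normally reflect the uncompressed size. A server that
    ignores that request could still send compressed data.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # A 100KB page, in line with the commonly recommended size.
    print(html_size_report(b"x" * 100 * 1024))
```

For a live page, the two functions can be combined as `html_size_report(fetch_html("https://example.com/"))`, where the URL is a placeholder.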

In theory, it may seem disconcerting that a page could contain content that is not used for indexing. But in practice, 15MB is a very large amount of HTML.

As Google states, resources such as images and videos are fetched separately. Based on Google's wording, this 15MB cutoff appears to apply to the HTML file only.

It will be hard to get past this limit with HTML unless you're posting whole books' worth of text on one page.

If you have pages of more than 15MB of HTML, you likely have underlying issues that need to be fixed anyway.
