Googlebot crawls and indexes the first 15MB of HTML content

29.06.2022 03:52 AM

An update to Googlebot's help document confirms that it crawls only the first 15MB of a webpage, and that anything after this cutoff is not included in ranking calculations.

Google specifies in the help document:

“Any resources referenced in HTML such as images, videos, CSS and JavaScript are fetched separately. After the first 15MB of the file, Googlebot stops crawling and considers only the first 15MB of the file for indexing. File size limit applies to uncompressed data.”

This left some in the SEO community wondering whether Googlebot would completely ignore text that appears below images once the cutoff is reached in an HTML file.

"It's specific to the HTML file itself, as written," John Mueller, a Google Search Advocate, explained via Twitter.

"Resources/embedded content pulled using IMG tags are not part of the HTML file."

What does this mean for SEO?

To ensure it is weighted by Googlebot, important content must now be included near the top of web pages.

This means that the code should be structured so that SEO-relevant information falls within the first 15MB of a supported HTML or text-based file.
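As a rough illustration, the check below tests whether a given phrase would fall inside the first 15MB of a page. It is only a sketch: the function name `phrase_within_limit` is hypothetical, the page is assumed to be UTF-8, and reading "15MB" as 15 * 1024 * 1024 bytes applied at a simple byte offset is an assumption, since Google's wording does not spell out the exact arithmetic.

```python
# Assumption: the cutoff is a byte offset into the uncompressed file, and
# 15MB is read as 15 * 1024 * 1024 bytes. Google's exact handling may differ.
LIMIT_BYTES = 15 * 1024 * 1024


def phrase_within_limit(html_bytes: bytes, phrase: str, limit: int = LIMIT_BYTES) -> bool:
    """Return True if the first occurrence of `phrase` ends before the cutoff."""
    needle = phrase.encode("utf-8")  # assumes the page itself is UTF-8
    start = html_bytes.find(needle)
    if start == -1:
        return False  # phrase does not appear in the file at all
    return start + len(needle) <= limit
```

Passing the uncompressed HTML of a page together with a key heading or product name gives a quick signal of whether that content sits before or after the assumed cutoff.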

It also means that images and videos should, whenever possible, be compressed and referenced as separate files rather than encoded directly in the HTML.
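One common way media ends up encoded directly in HTML is as Base64 `data:` URIs. The sketch below estimates how many bytes of a page such inline payloads occupy. The regular expression and the function name `inline_data_uri_bytes` are illustrative assumptions; the pattern only catches Base64 data URIs and will miss other forms of inlining, such as inline SVG markup.

```python
import re

# Matches Base64 data URIs such as data:image/png;base64,iVBORw0...
# This is a simplified pattern for illustration, not a full URI parser.
DATA_URI = re.compile(r"data:[\w/+.-]+;base64,[A-Za-z0-9+/=]+")


def inline_data_uri_bytes(html: str) -> int:
    """Return the total length of all Base64 data URIs found in the HTML.

    The matched text is pure ASCII, so its character count equals its byte count.
    """
    return sum(len(match.group(0)) for match in DATA_URI.finditer(html))
```

A large result relative to the overall page size suggests that moving those images into separate files would free up space within the 15MB limit, since referenced resources are fetched separately.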

Because current SEO best practices recommend keeping HTML pages to 100KB or less, many sites will not be affected by this change. Page size can be checked using a variety of tools, including Google PageSpeed Insights.
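For a quick check outside of such tools, the size of the uncompressed HTML can be compared with the limit directly. The following is a minimal sketch using only Python's standard library. The helper names `html_size_report` and `fetch_html` are hypothetical, and treating 15MB as 15 * 1024 * 1024 bytes is an assumption.

```python
import urllib.request

# Assumption: 15MB is read as 15 * 1024 * 1024 bytes of uncompressed data.
LIMIT_BYTES = 15 * 1024 * 1024


def html_size_report(html_bytes: bytes, limit: int = LIMIT_BYTES) -> dict:
    """Summarize how the uncompressed HTML size compares with the limit."""
    size = len(html_bytes)
    return {
        "size_bytes": size,
        "limit_bytes": limit,
        "within_limit": size <= limit,
        "share_of_limit": size / limit,
    }


def fetch_html(url: str) -> bytes:
    """Download a page, asking the server not to compress the response."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "identity"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()
```

For example, `html_size_report(fetch_html("https://example.com/"))` returns the byte count and the fraction of the limit the page uses. Only the HTML document itself is measured here, which matches the article's point that images, videos, CSS and JavaScript are fetched separately.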

In theory, it would seem disconcerting that you are likely to have content on a page that is not being used for indexing. But in practice, 15MB is a very large amount of HTML.

As Google states, resources such as images and videos are fetched separately. Based on Google's wording, this 15MB cutoff appears to apply to the HTML file only.

It will be hard to get past this limit with HTML unless you're posting whole books' worth of text on one page.

If you have pages of more than 15MB of HTML, you likely have underlying issues that need to be fixed anyway.
