Googlebot crawls and indexes the first 15MB of HTML content

29.06.2022 03:52 AM

Google has updated its Googlebot help document to confirm that the crawler fetches only the first 15MB of a webpage, and that nothing after this cutoff is included in ranking calculations.

Google specifies in the help document:

“Any resources referenced in HTML such as images, videos, CSS and JavaScript are fetched separately. After the first 15MB of the file, Googlebot stops crawling and considers only the first 15MB of the file for indexing. File size limit applies to uncompressed data.”

This left some in the SEO community wondering whether Googlebot would completely ignore text that appears below images at the cutoff point in an HTML file.

"It's specific to the HTML file itself, as written," John Mueller, a Google Search Advocate, explained via Twitter.

"Resources/embedded content pulled using IMG tags are not part of the HTML file."

What does this mean for SEO?

To ensure it is weighted by Googlebot, important content must now be included near the top of web pages.

This means that the code should be structured so that SEO-relevant information falls within the first 15MB of a supported HTML or text-based file.
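One rough way to check where a piece of content sits relative to the cutoff is to measure its byte offset in the encoded HTML. The sketch below is only an illustration: the function names are hypothetical, UTF-8 encoding is assumed, and reading "15MB" as 15 * 1024 * 1024 bytes is an assumption, since Google's wording does not give an exact byte count.

```python
from typing import Optional

# Assumption: "15MB" is taken as 15 * 1024 * 1024 bytes.
CUTOFF_BYTES = 15 * 1024 * 1024


def byte_offset(html: str, marker: str, encoding: str = "utf-8") -> Optional[int]:
    """Return the byte position where `marker` starts in the encoded HTML, or None."""
    raw = html.encode(encoding)
    pos = raw.find(marker.encode(encoding))
    return pos if pos >= 0 else None


def within_cutoff(html: str, marker: str, limit: int = CUTOFF_BYTES) -> bool:
    """True if the whole marker lies inside the first `limit` bytes of the HTML."""
    pos = byte_offset(html, marker)
    if pos is None:
        return False
    return pos + len(marker.encode("utf-8")) <= limit
```

For example, `within_cutoff(page_html, "<h1>")` would report whether the first heading starts and ends before the assumed cutoff.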

It also means that images and videos should be compressed and, whenever possible, not encoded directly into the HTML.
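Media encoded directly into HTML usually appears as `data:` URIs, which count toward the HTML file's size. As a minimal sketch, the snippet below estimates how many bytes of a page are taken up by such inline data. The regular expression is a simplified approximation of the data URI syntax, and the function name is hypothetical.

```python
import re

# Simplified pattern for data URIs such as data:image/png;base64,AAAA
# (an approximation for illustration, not a full parser).
DATA_URI = re.compile(r"""data:[\w.+-]+/[\w.+-]+(?:;[\w=.+-]+)*,[^"'\s)>]*""")


def inline_data_bytes(html: str) -> int:
    """Return the total number of UTF-8 bytes occupied by data URIs in the HTML."""
    return sum(len(m.group(0).encode("utf-8")) for m in DATA_URI.finditer(html))
```

A large result relative to the total page size suggests the embedded media could be moved to separate files, which Googlebot fetches separately.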

Because current SEO best practices recommend keeping HTML pages to 100KB or less, many sites will not be affected by this change. Page size can be checked using a variety of tools, including Google PageSpeed Insights.
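Since the limit applies to uncompressed data, a size check should look at the HTML after any transfer compression has been undone. The following is a minimal sketch under stated assumptions: only gzip is handled, the function names are hypothetical, and "15MB" is read as 15 * 1024 * 1024 bytes, which Google's wording does not spell out.

```python
import gzip

# Assumption: "15MB" is taken as 15 * 1024 * 1024 bytes.
GOOGLEBOT_HTML_LIMIT = 15 * 1024 * 1024


def uncompressed_size(body: bytes, content_encoding: str = "") -> int:
    """Return the size in bytes of the HTML after undoing gzip compression, if any."""
    if content_encoding.lower() == "gzip":
        body = gzip.decompress(body)
    return len(body)


def fits_within_limit(body: bytes, content_encoding: str = "",
                      limit: int = GOOGLEBOT_HTML_LIMIT) -> bool:
    """True if the uncompressed HTML is no larger than the assumed limit."""
    return uncompressed_size(body, content_encoding) <= limit
```

Here `body` would be the raw response bytes for the HTML document and `content_encoding` the value of its `Content-Encoding` header.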

In theory, it may seem disconcerting that a page could contain content that is not used for indexing. But in practice, 15MB is a very large amount of HTML.

As Google states, resources such as images and videos are fetched separately. Based on Google's wording, this 15MB cutoff appears to apply to HTML only.

It will be hard to get past this limit with HTML unless you're posting whole books' worth of text on one page.

If you have pages of more than 15MB of HTML, you likely have underlying issues that need to be fixed anyway.
