Ever wondered why some of your important pages aren't indexed in Google?
Despite submitting your URLs through XML sitemaps and following best practices, many pages still end up in the dreaded "Not Indexed" category in Google Search Console.
There are 3 common types of Not Indexed pages every SEO professional should know about. In this article, we'll identify which category your pages fall into.
So, let's dive in.
The Page Indexing report provides insights into why pages aren't indexed.
Although this can be very useful, it helps to group the Not Indexed reports into 3 categories, because doing so clarifies exactly what action we need to take to fix these indexing issues.
The 3 categories of Not Indexed pages are technical, duplicate, and quality.
The pages in the technical category don't meet Google's technical requirements.
What indexing states are in this category?
These pages either don't meet Google's basic technical requirements or have directives that explicitly tell Google not to index them:
Why are pages grouped into this category?
Google detected that the page does not meet the minimum technical requirements.
For a page to be eligible for indexing, it must meet the following technical requirements:
If we group the technical errors in Google Search Console, each one corresponds to one of the minimum requirements:
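To illustrate the idea, here is a minimal sketch of a technical eligibility check. It assumes the requirements are roughly those in Google's public documentation (Googlebot isn't blocked, the page returns HTTP 200, and nothing tells Google not to index it). The `PageSnapshot` type and the `technical_blockers` function are hypothetical names for illustration, not part of any Google tool.

```python
from dataclasses import dataclass
from urllib.robotparser import RobotFileParser


@dataclass
class PageSnapshot:
    """What we already know about a page (hypothetical structure)."""
    url: str
    status_code: int
    robots_meta: str  # content of <meta name="robots">, "" if absent


def technical_blockers(page: PageSnapshot, robots_txt: str) -> list[str]:
    """Return the technical reasons a page may be ineligible for indexing.

    The three checks are assumptions based on Google's documented
    minimum requirements; they are not an exhaustive list.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    blockers = []
    # Requirement 1 (assumed): Googlebot is not blocked by robots.txt.
    if not parser.can_fetch("Googlebot", page.url):
        blockers.append("blocked by robots.txt")
    # Requirement 2 (assumed): the page returns HTTP 200.
    if page.status_code != 200:
        blockers.append(f"HTTP status {page.status_code}")
    # Directive check: a noindex tells Google not to index the page.
    if "noindex" in page.robots_meta.lower():
        blockers.append("noindex directive")
    return blockers


robots = "User-agent: *\nDisallow: /private/\n"
page = PageSnapshot("https://example.com/old", 404, "noindex, follow")
print(technical_blockers(page, robots))
```

An empty list means none of these particular checks failed. It does not guarantee that Google will index the page.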
The pages in the duplicate category contain duplicate or similar content.
These errors relate to Google’s canonicalization process in the indexing pipeline (I’ve provided descriptions, as these are a bit more complicated):
Pages are grouped into this category because of Google’s canonicalization algorithm.
When Google identifies duplicate pages across your website it:
This process is called canonicalization. However, the process isn't static.
Google continuously evaluates the canonical signals to determine which URL should be the canonical URL for the cluster. It looks at:
If a page was previously the canonical URL but new signals make Google select another URL in the cluster, then your original page gets removed from search results.
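The clustering and re-selection described above can be sketched roughly as follows. This is a toy model, not Google's algorithm: the signals used (`declared_canonical`, `in_sitemap`, HTTPS) and their weights are assumptions chosen for illustration, and exact-match hashing stands in for whatever similarity detection Google really uses.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    body: str
    declared_canonical: bool = False  # other pages point rel=canonical here
    in_sitemap: bool = False


def cluster_duplicates(pages: list[Page]) -> list[list[Page]]:
    """Group pages whose whitespace- and case-normalized body is identical."""
    clusters = defaultdict(list)
    for page in pages:
        normalized = " ".join(page.body.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        clusters[digest].append(page)
    return list(clusters.values())


def signal_score(page: Page) -> int:
    """Combine canonical signals; the weights are hypothetical."""
    score = 0
    score += 3 if page.declared_canonical else 0
    score += 2 if page.in_sitemap else 0
    score += 1 if page.url.startswith("https://") else 0
    return score


def pick_canonical(cluster: list[Page]) -> Page:
    """Select the URL with the strongest signals in a duplicate cluster."""
    return max(cluster, key=signal_score)


a = Page("https://example.com/shoes", "Red  Shoes", in_sitemap=True)
b = Page("https://example.com/shoes?ref=nav", "red shoes")
cluster = cluster_duplicates([a, b])[0]
print(pick_canonical(cluster).url)  # a wins while it has the stronger signals

# Signals change over time, so the selection is re-evaluated.
b.declared_canonical = True
print(pick_canonical(cluster).url)  # b now outscores a
```

The second call shows the non-static part of the process: once the signals shift, the previously canonical URL loses its place in the cluster.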
The final category, quality, covers pages that Google is actively removing.
These types of indexing errors are split into 3 groups based on the signals Google collects about pages over time:
Google is actively removing these pages from its search results and index.
Our research has found that nearly 80% of pages in the 'crawled - currently not indexed' index coverage state were historically crawled and indexed.
But that's not all.
Our research into how Google manages its index highlights a mechanism that uses page quality to decide whether pages are removed from search results.
But that's not all.
We also found that index coverage states indicate crawl priority in Google's architecture, and that this crawl priority is based on page quality.
But that's not all.
Our 130-Day Indexing Rule research has identified that Google actively removes pages from its search results. If an indexed page has not been recrawled in 130 days, there is a 99% chance that it will be changed to Not Indexed.
But that's not all.
Our 190-Day Indexing Rule research revealed a critical discovery: Google doesn't merely remove pages from its index, it actively forgets them. When a page goes 190 days without being recrawled, Google purges it from memory entirely, marking it as 'URL is unknown to Google'.
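As a rough illustration of the two thresholds, the sketch below flags a page based on how long ago it was last crawled. The function name, the labels, and the choice to treat each threshold as inclusive (130 days or more, 190 days or more) are assumptions. The last crawl date would come from your own data, for example the URL Inspection tool or server logs.

```python
from datetime import date

NOT_INDEXED_AFTER_DAYS = 130  # from the 130-Day Indexing Rule
FORGOTTEN_AFTER_DAYS = 190    # from the 190-Day Indexing Rule


def crawl_age_risk(last_crawled: date, today: date) -> str:
    """Classify a page by the number of days since its last crawl."""
    days_since_crawl = (today - last_crawled).days
    if days_since_crawl >= FORGOTTEN_AFTER_DAYS:
        return "likely 'URL is unknown to Google'"
    if days_since_crawl >= NOT_INDEXED_AFTER_DAYS:
        return "likely Not Indexed"
    return "within crawl window"


print(crawl_age_risk(date(2024, 1, 1), date(2024, 5, 10)))  # 130 days apart
```

Running this across a full URL list can help surface pages approaching either threshold before they drop out.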
All of this research points to the same thing: The 3 index coverage states are a strong indicator of page quality and crawl priority.
If your website has a high proportion of indexed pages, this signals that you've exceeded Google's page quality benchmark.
However, if your website has a high number of 'crawled - currently not indexed' pages, this indicates a low-quality website, and Google is actively removing its pages from search results.
Finally, if you've got a lot of pages in the 'URL is unknown to Google' report, it's a strong indication that these pages have zero crawl priority.
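To put a number on "high proportion", one option is to compute the share of URLs in each coverage state. The sketch below assumes you already have one state label per URL, for example from a Page Indexing export. The `coverage_shares` helper is a hypothetical name, and no quality cut-off is implied because the research above does not state one.

```python
from collections import Counter


def coverage_shares(states: list[str]) -> dict[str, float]:
    """Return the fraction of URLs in each index coverage state."""
    counts = Counter(states)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {state: count / total for state, count in counts.items()}


# Illustrative sample data, not real measurements.
sample = (
    ["Indexed"] * 6
    + ["Crawled - currently not indexed"] * 3
    + ["URL is unknown to Google"] * 1
)
for state, share in coverage_shares(sample).items():
    print(f"{state}: {share:.0%}")
```

Tracking these shares over time is more informative than a single snapshot, since the states themselves change as Google recrawls.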
To recap, there are 3 categories of Not Indexed pages: technical, duplicate, and quality.
Technical barriers and duplicate content issues are generally within your control to fix through standard optimisation practices.
Quality issues, however, require deeper analysis and often signal more significant problems with how your content meets user and search engine expectations.
Regularly monitoring your indexation status is crucial to identifying which category your Not Indexed pages fall into and taking appropriate action.