After tracking millions of URLs with Indexing Insight, we've uncovered significant gaps in what Google Search Console tells you about your pages' indexing status.
In this newsletter, I'll reveal the top 5 things most SEOs don't know about the Page Indexing report and how these hidden insights could be affecting your SEO analysis.
So, let's dive in.
The current definition of 'crawled - currently not indexed' is misleading.
If you search for 'crawled - currently not indexed', most articles define this status as Google having crawled the page but not yet chosen to index it. The definition provides no context that the page was historically indexed.
This comes directly from Google's documentation:
“The page was crawled by Google but not indexed. It may or may not be indexed in the future; no need to resubmit this URL for crawling.” Source: Page Indexing Report, Google Search Documentation.
However, based on data from Indexing Insight, this definition is misleading.
When analysing 1.7 million pages in Indexing Insight, we found that 70-80% of pages with the 'crawled - currently not indexed' status have historically been indexed by Google.
This means when you see the 'crawled - currently not indexed' status for a page, you're not looking at pages waiting to be indexed. In fact, you're looking at pages Google has actively removed from its search results.
At Indexing Insight, we created a new indexing state to identify pages that Google has actively removed from its index.
This state is called: 'Crawled - previously indexed'.
Our 'crawled - previously indexed' report surfaces indexed pages that have been actively removed from Google's index. Indexing Insight is the only tool that identifies these pages.
For more information, read our 'crawled - previously indexed' support documentation.
For one client alone, we found nearly 130,000 pages with what should be called 'crawled - previously indexed' status. That's 13% of their monitored pages that Google has actively removed from serving in search results.
The 'crawled - previously indexed' report helped the client identify exactly which pages were being actively removed by Google.
The coverage state 'URL is unknown to Google' is even more misleading than you might think.
Google's official documentation states:
"If the label is URL is unknown to Google, it means that Google hasn't seen that URL before, so you should request that the page be indexed. Indexing typically takes a few days." Source: URL Inspection Tool, Search Console Help
But our data tells a different story. Many pages marked as 'URL is unknown to Google' in Search Console have actually been historically crawled and indexed.
Gary Illyes from Google confirmed this phenomenon on LinkedIn, explaining that Google's systems can "forget" URLs as they purge low-value pages from their index over time.
Our 190-day Indexing Rule research confirms that page URLs can move through indexing states as signals are collected over time.
The 190-day Indexing Rule is simple:
If a page is not crawled within 190 days, its indexing state can be actively forgotten.
As Google begins to forget a page exists, the indexing state can reverse.
Eventually, forgotten pages move to 'URL is Unknown to Google'.
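The 190-day rule above can be expressed as a simple check. This is a minimal sketch, assuming the page's last crawl time is available as an RFC 3339 timestamp (the format the URL Inspection API returns for `lastCrawlTime`); the function name is hypothetical.

```python
from datetime import datetime, timezone

# Threshold from the 190-day Indexing Rule research described above.
FORGET_THRESHOLD_DAYS = 190

def at_risk_of_being_forgotten(last_crawl_time: str) -> bool:
    """Return True if a page's last crawl falls outside the 190-day window,
    meaning its indexing state is at risk of being actively forgotten."""
    # Parse an RFC 3339 timestamp such as "2024-01-15T08:30:00Z".
    last_crawl = datetime.fromisoformat(last_crawl_time.replace("Z", "+00:00"))
    days_since = (datetime.now(timezone.utc) - last_crawl).days
    return days_since > FORGET_THRESHOLD_DAYS
```

Running this against your last-crawl dates gives an early warning before pages slip towards 'URL is Unknown to Google'.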
At Indexing Insight, we have created a new index coverage state that tracks when Google's index has 'forgotten' a page URL.
This state is called: 'URL is Known to Google'.
Our new 'URL is Known to Google' report identifies pages that Google's index has actively forgotten.
For one website monitoring 1 million URLs, 16% of URLs were labelled 'URL is Known to Google', and many of these had search performance data proving they were previously indexed.
The 'URL is unknown to Google' state is misreported in the Page Indexing report.
Our analysis found that 94% of pages that should be labeled 'URL is unknown to Google' are instead grouped under the 'Discovered - currently not indexed' report in GSC.
This misreporting isn't a bug. It's by design.
When you inspect these URLs individually using the URL Inspection tool, they show the 'URL is unknown to Google' status, contradicting what the Page Indexing report shows.
Why does this matter?
Because our 190-day Indexing Rule research has found that pages with 'URL is unknown to Google' status have zero crawl priority in Google's systems.
When you can't see which pages have this status, you can't take appropriate action to address the underlying quality issues.
By grouping these URLs under 'Discovered - currently not indexed', GSC leads you to believe you have a discovery problem when you actually have a quality problem that's so severe Google has chosen to forget your content entirely.
At Indexing Insight, we don't group indexing states together.
Instead, we allow our customers to view separate reports for 'URL is Unknown to Google' and 'Discovered - currently not indexed'.
Our customers can now clearly see the true indexing state of their important pages without guessing whether Google has discovered, crawled, or forgotten them.
Google Search Console indexing data does not indicate crawl priority.
The Page Indexing report in Google Search Console does not help users connect the dots between crawl priority and the index coverage state.
Our research at Indexing Insight, validated by Google's Martin Splitt, has revealed that index coverage states directly indicate crawl priority.
The crawling, indexing and rendering process can be matched to indexing states.
Learn more by reading How Indexing States Indicate Crawl Priority.
The 190-Day Indexing Rule research identified that the different Search Console indexing states indicate crawl priority in Google's crawling architecture.
The different index coverage states can be mapped to how long it takes Googlebot to recrawl a page.
At Indexing Insight, we help customers combine indexing and crawl data.
At Indexing Insight, we've built Crawl Coverage reports that help customers identify the crawl priority of pages using indexing data.
We use the Days Since Last Crawl metric to help group pages into different time buckets and help our customers track this over time.
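The bucketing idea can be sketched as follows. The bucket boundaries here are illustrative assumptions for the example, not Indexing Insight's actual Crawl Coverage buckets.

```python
from collections import Counter

# Illustrative time buckets (assumed boundaries, aligned loosely
# with the 190-day threshold discussed above).
BUCKETS = [
    (0, 30, "0-30 days"),
    (31, 90, "31-90 days"),
    (91, 190, "91-190 days"),
]

def bucket_for(days_since_last_crawl: int) -> str:
    """Map a Days Since Last Crawl value to its time bucket."""
    for low, high, label in BUCKETS:
        if low <= days_since_last_crawl <= high:
            return label
    return "190+ days"  # beyond the 190-day rule's threshold

def crawl_coverage(pages: dict) -> Counter:
    """Group a {url: days_since_last_crawl} mapping into bucket counts."""
    return Counter(bucket_for(days) for days in pages.values())
```

Tracking these bucket counts over time shows whether Googlebot's attention to a site is growing or shrinking.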
Our customers can use Google's own crawl and indexing data to understand how often Googlebot crawls their pages.
Check our pricing page to see which plans include Crawl Coverage reports.
The URL Inspection Tool is the source of truth, not the Page Indexing report.
When you see conflicting data between the URL Inspection tool and the Page Indexing report, you should always trust the URL Inspection tool.
Here's why:
The Google Search Central team has confirmed that the URL Inspection Tool is the most authoritative source for indexing data and should be considered the source of truth when conflicts arise.
This means that for truly accurate indexing analysis, you need to inspect URLs individually or use the URL Inspection API, which is what Indexing Insight does to provide daily monitoring.
Learn more by reading URL Inspection Report vs Page Indexing Report.
At Indexing Insight, we use the URL Inspection API to pull indexing data.
This means our customers pull indexing data for their important pages straight from Google's index: the source of truth.
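For readers who want to query the source of truth themselves, here is a minimal sketch of calling the URL Inspection API's `index:inspect` endpoint. It assumes you already hold an OAuth access token with the Search Console scope (token acquisition is not shown), and the function name is hypothetical.

```python
import json
import urllib.request

# Google Search Console URL Inspection API endpoint.
ENDPOINT = "https://searchconsole.googleapis.com/v1/urlInspection/index:inspect"

def inspect_url(site_url: str, page_url: str, access_token: str) -> dict:
    """Fetch the index status of one URL from the URL Inspection API."""
    body = json.dumps({"siteUrl": site_url, "inspectionUrl": page_url}).encode()
    req = urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    # indexStatusResult includes fields such as coverageState
    # (e.g. "Crawled - currently not indexed") and lastCrawlTime.
    return result["inspectionResult"]["indexStatusResult"]
```

Note that the API enforces a daily quota per property, which is why monitoring large sites at scale requires scheduling inspections over time.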
Our real-time Google indexing data helps customers quickly identify and address issues. Each project receives daily alert emails, allowing you to monitor Google indexing directly from your inbox.
The hidden insights in Google Search Console completely change how you approach SEO analysis.
These nuances become even more critical for sites with 100,000+ pages, where the scale makes manual analysis through GSC virtually impossible.
The Page Indexing report in Google Search Console doesn't tell the full story about your website's indexing health.
By understanding the true meaning behind coverage states, the misreporting of 'URL is unknown to Google', and how index states indicate crawl priority, you can develop a more accurate picture of how Google views your content.
Importantly, when Google core updates roll out, they don't just impact rankings.
They actively cause Google's systems to reprioritize what is worth crawling and indexing, potentially removing large numbers of pages from the index entirely.