Find Indexing Issues in Three Simple Steps
Indexing issues can seriously impact your results in organic search. They are problems on your website that make Google’s crawling and indexing work harder, because Google has to invest more computing power to process your website’s information.
In this short guide we will show how you can quickly spot indexing issues in three simple steps. These steps will give you a clear picture of your website’s current indexing status compared with how it should be (how many pages are indexable versus how many pages are actually indexed).
The three steps correspond to the three sources of information we will use to gather indexing data and then produce a comparison:
- “Site:” Search Command
- Search Console Coverage Report
- Crawl Data
Let’s now run through each of these steps to gather the information we need.
1. The “Site:” Search Command
The site: command is one of the many advanced search operators you can use to refine results in Google Search. It gives you the number of pages included in the index. Since this figure is not updated live, it won’t be 100% accurate. Even so, it is still a very good indicator: the number of indexed pages it returns is usually close to the actual one, and it shouldn’t differ much from what you will get from the next two sources.
To run this command, we just need to type “site:” followed by our domain name into the search box, for example site:example.com.
Now we have a very good approximation of the real number of indexed pages for our website. However, we also need to dig into Google Search Console’s reports to get a second version of this number for our comparison.
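If you check several domains regularly, the same query can be built as a link instead of typed by hand. The snippet below is a minimal sketch that only constructs the search URL for a site: query; the function name `site_query_url` is our own, and it does not fetch or parse any results.

```python
from urllib.parse import urlencode


def site_query_url(domain: str) -> str:
    """Build a Google Search URL for the query site:<domain>.

    Hypothetical helper for illustration: it only builds the link,
    you still open it in a browser to read the result count.
    """
    query = f"site:{domain.strip()}"
    return "https://www.google.com/search?" + urlencode({"q": query})


print(site_query_url("example.com"))
# https://www.google.com/search?q=site%3Aexample.com
```

Opening the printed link in a browser is equivalent to typing site:example.com into the search box.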
2. Search Console Coverage Report
Google Search Console is the best source of information about your website’s performance in Google Search. If you haven’t set it up yet, you should definitely do it now: it gives you very important information about your site’s health within Google Search, and it’s free! To find out how many indexed pages this tool reports, we just need to go to the Coverage section and check the number of Valid pages (see below).
3. Crawl Data
Finally, to make a proper comparison, we need to know how many of our pages are actually crawlable and indexable. We may have an idea of how many pages we think should be indexed, but the best way to know exactly how many should be indexed is to run a crawl of our website.
There are many crawlers out there, but on this occasion we will use the best-known one in the industry, Screaming Frog. Screaming Frog is a crawler that lets you see your website the way Google does, and it will provide us with a list of the pages that can be indexed. We are not going to explain how to set up the crawl with Screaming Frog this time. However, the documentation about the different ways you can configure your crawl can be found here.
From our crawl, we can get the number of pages that should actually be indexed. In Screaming Frog’s case, this information is available in the Success 2XX section, where each URL is marked as either Indexable or Non-Indexable. Now we just need to click on export, and this table will be exported as a spreadsheet in which we can filter the URLs marked as Indexable.
Filtering by indexable pages gives us the number of pages that should be indexed for our website.
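If you prefer not to filter the spreadsheet by hand, the same count can be taken from a CSV export with a few lines of Python. This is a minimal sketch that assumes the export has a column named `Indexability` whose values are `Indexable` or `Non-Indexable`; check the header row of your own export and adjust the `column` argument if it differs. The sample rows are made up for illustration.

```python
import csv
import io


def count_indexable(csv_file, column="Indexability", wanted="Indexable"):
    """Count rows whose `column` value is exactly `wanted`.

    The column name is an assumption about the export layout,
    so confirm it against your own file before relying on it.
    """
    reader = csv.DictReader(csv_file)
    return sum(1 for row in reader if (row.get(column) or "").strip() == wanted)


# Made-up sample standing in for a real crawl export.
sample = io.StringIO(
    "Address,Status Code,Indexability\n"
    "https://example.com/,200,Indexable\n"
    "https://example.com/about,200,Indexable\n"
    "https://example.com/tag/x,200,Non-Indexable\n"
)
print(count_indexable(sample))  # 2
```

For a real export, pass an open file instead of the sample, for example `open("crawl_export.csv", newline="", encoding="utf-8-sig")`, where the file name is hypothetical.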
Now we just need to create a table and compare the number of pages from our three sources of information. Luckily for both site owners, the first two pictures displayed belong to different websites; otherwise, the site would possibly be in serious trouble.
We can now compare our three data points: indexed pages from Search Results, indexed pages from Search Console, and indexable pages from our crawl. If the number of pages indexed (Search Results and Search Console) is close to the number of indexable pages (from the crawl), our site is probably in a good situation.
| Search Results | Search Console | Indexable from Crawl |
Table: Comparing indexed pages versus indexable pages
Small differences like the one shown above are commonly caused by the time Google Search results and Google Search Console need to update the data displayed for your website. In that case your pages are most likely correctly indexed, but Google is not yet showing the latest data despite having it.
You can quickly check this by taking a sample of pages that should or should not be indexed, depending on what you want to confirm, and then checking their live status with the Google Search Console URL Inspection Tool. The URL Inspection Tool lets you check the live status of any URL, as explained in detail in the official documentation here.
If there is a considerable gap between the pages that should be indexed and the pages actually indexed, your site could be suffering from indexing issues, which can seriously harm your organic traffic from Google.
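Picking that sample at random avoids checking only the pages you already know well. The sketch below draws a small random sample from a list of URLs (for instance, the indexable URLs from your crawl) that you can then paste into the URL Inspection Tool one by one. The function name, the sample size and the example URLs are all our own assumptions.

```python
import random


def sample_urls(urls, k=10, seed=None):
    """Return up to k distinct URLs chosen at random.

    Passing a seed makes the selection repeatable, which helps if you
    want to re-check the same pages a few days later.
    """
    rng = random.Random(seed)
    unique = sorted(set(urls))  # drop duplicates, fix the order
    return rng.sample(unique, min(k, len(unique)))


# Made-up URLs for illustration only.
crawl_urls = [f"https://example.com/page-{i}" for i in range(1, 51)]
for url in sample_urls(crawl_urls, k=5, seed=1):
    print(url)
```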
| Search Results | Search Console | Indexable from Crawl |
Table: Big differences between indexable and indexed pages mean trouble
If, unfortunately, once you have collected all the data, the table for your website looks more like the one above, it is very likely that the site has some issues and a more in-depth analysis is needed. This further analysis means carrying out a full Indexing Audit, which is one of the points usually covered within Technical SEO Audits. The Indexing Audit will shed light on the problems behind the difference between indexable and indexed pages.
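The comparison described in this guide can also be expressed as a small calculation. The sketch below measures how far each indexed count is from the indexable count and flags the site for an audit when the gap exceeds a tolerance. The 10% tolerance and the example numbers are arbitrary assumptions, not figures from Google or from any real site; choose a threshold that makes sense for your own website.

```python
def indexing_gap(search_results, search_console, indexable, tolerance=0.10):
    """Compare indexed counts with the indexable count from the crawl.

    Returns the relative gap for each source and whether any gap
    exceeds `tolerance` (an assumed threshold, not an official one).
    """
    if indexable <= 0:
        raise ValueError("indexable must be a positive number of pages")
    gaps = {
        "search_results": abs(search_results - indexable) / indexable,
        "search_console": abs(search_console - indexable) / indexable,
    }
    needs_audit = any(gap > tolerance for gap in gaps.values())
    return gaps, needs_audit


# Made-up numbers: a small gap, then a large one.
print(indexing_gap(980, 1000, 1010)[1])  # False
print(indexing_gap(300, 350, 1000)[1])   # True
```

Using the absolute difference means the check also fires when far more pages are indexed than should be, which is a problem in its own right.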
If you need to run a crawl to make this comparison and don’t have the resources to do it, or if you have already run this process and found that your website may have indexing issues and you need help finding the causes, please feel free to contact us, either using the contact form below or by sending us a message at firstname.lastname@example.org