Statistics of Common Crawl Monthly Archives

Number of pages, distribution of top-level domains, crawl overlaps, etc. - basic metrics about Common Crawl Monthly Crawl Archives


Crawler-Related Metrics

Crawler-related metrics are extracted from the crawler log files (cf. ../stats/crawler/).

The first plot shows the absolute numbers for these metrics.

Crawler metrics

The second plot shows the relative share of each fetch status.

Percentage of fetch status
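
The step from absolute counts to relative shares can be illustrated with a minimal sketch. It assumes the per-status counts for one crawl are already available as a mapping; the function name, the status names, and the numbers below are hypothetical and are not taken from the project's code or data.

```python
def fetch_status_percentages(counts):
    """Convert absolute counts per fetch status into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        # Avoid division by zero when no fetches were recorded.
        return {status: 0.0 for status in counts}
    return {status: 100.0 * n / total for status, n in counts.items()}


# Made-up counts for a single crawl; the status names are illustrative assumptions.
example_counts = {"fetch_ok": 750, "fetch_redirect": 150, "fetch_notfound": 100}
example_shares = fetch_status_percentages(example_counts)

for status, share in example_shares.items():
    print(f"{status}: {share:.1f}%")
```

Normalizing per crawl in this way makes crawls of different sizes comparable, which is presumably why the fetch status is shown both in absolute and in relative terms.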