Multi-Threaded Geo Web Crawler In Java
DZone
APRIL 18, 2019
This article presents an implementation of Mowglee, a web crawling system that uses geography as its main classifying criterion for crawling. It runs in multi-threaded mode and provides default implementations of the robots exclusion protocol, sitemap generation, data classifiers, and data analyzers, along with a general framework for building applications on top of a web crawler.
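To make the idea of geography-based classification in a multi-threaded crawler more concrete, here is a minimal sketch. It is not Mowglee's actual code: the class `GeoCrawlSketch`, the methods `classify` and `group`, and the rule of treating a two-letter top-level domain as a country code are all assumptions made for illustration. No pages are fetched; the sketch only shows how seed URLs might be bucketed by region from several worker threads.

```java
import java.net.URI;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class GeoCrawlSketch {

    // Hypothetical rule (an assumption, not Mowglee's logic): a two-letter
    // top-level domain is treated as a country code, anything else as generic.
    public static String classify(String url) {
        String host = URI.create(url).getHost();
        if (host == null) {
            return "UNKNOWN";
        }
        String tld = host.substring(host.lastIndexOf('.') + 1);
        return tld.matches("[A-Za-z]{2}") ? tld.toUpperCase(Locale.ROOT) : "GENERIC";
    }

    // Classifies each seed URL on a fixed-size thread pool and collects the
    // URLs into one thread-safe queue per region.
    public static Map<String, Queue<String>> group(List<String> seeds, int threads)
            throws InterruptedException {
        Map<String, Queue<String>> byRegion = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String seed : seeds) {
            pool.execute(() -> {
                String region = classify(seed);
                byRegion.computeIfAbsent(region, k -> new ConcurrentLinkedQueue<>()).add(seed);
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return byRegion;
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder seed URLs, used only to exercise the grouping.
        List<String> seeds = List.of(
                "http://example.in/news",
                "http://example.de/",
                "https://example.com/about");
        group(seeds, 4).forEach((region, urls) -> System.out.println(region + " -> " + urls));
    }
}
```

A real crawler would replace the per-seed task with a worker that fetches the page, checks the site's robots.txt rules, and enqueues newly discovered links into the matching region queue.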