crawler4j: crawl a list of URLs without crawling the entire website
25-06-2021
Question
I have a list of web URLs that need to be crawled. Is it possible to crawl only the pages in that list, without crawling any deeper? If I add the URLs as seeds, crawler4j crawls the full website to full depth.
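Solution
To crawl only the pages you added as seeds, set the maximum crawl depth to 0 with setMaxDepthOfCrawling(0). Seeds are at depth 0, so the links found on them are not followed. The default value of -1 means unlimited depth, which is why the whole site gets crawled.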
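import edu.uci.ics.crawler4j.crawler.CrawlConfig;
import edu.uci.ics.crawler4j.fetcher.PageFetcher;
// Depth 0: fetch the seed pages only; links found on them are not followed.
CrawlConfig config = new CrawlConfig();
config.setMaxDepthOfCrawling(0);
PageFetcher pageFetcher = new PageFetcher(config);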
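To see why a depth of 0 limits the crawl to the seed list, here is a minimal sketch of depth-limited crawling over a small in-memory link graph. It uses only the Java standard library and is not crawler4j's implementation. The class name DepthLimitSketch, the LINKS map, and the example.com URLs are all hypothetical, and it assumes the convention described above, where seeds are at depth 0 and each followed link adds 1.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

class DepthLimitSketch {

    // Hypothetical in-memory "web": page URL -> links found on that page.
    static final Map<String, List<String>> LINKS = Map.of(
        "https://example.com/a", List.of("https://example.com/a/1", "https://example.com/a/2"),
        "https://example.com/b", List.of("https://example.com/b/1"),
        "https://example.com/a/1", List.of("https://example.com/a/1/x"));

    // Breadth-first visit. Seeds are depth 0; a link found on a page
    // at depth d is scheduled at depth d + 1, but only while d < maxDepth.
    static List<String> crawl(List<String> seeds, int maxDepth) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Queue<Map.Entry<String, Integer>> queue = new ArrayDeque<>();
        for (String seed : seeds) {
            if (seen.add(seed)) {
                queue.add(Map.entry(seed, 0));
            }
        }
        while (!queue.isEmpty()) {
            Map.Entry<String, Integer> current = queue.poll();
            visited.add(current.getKey());
            if (current.getValue() >= maxDepth) {
                continue; // depth limit reached: do not follow this page's links
            }
            for (String link : LINKS.getOrDefault(current.getKey(), List.of())) {
                if (seen.add(link)) {
                    queue.add(Map.entry(link, current.getValue() + 1));
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        List<String> seeds = List.of("https://example.com/a", "https://example.com/b");
        // Depth 0: only the two seeds are visited.
        System.out.println(crawl(seeds, 0));
        // Depth 1: the seeds plus the pages they link to directly.
        System.out.println(crawl(seeds, 1));
    }
}
```

With a maximum depth of 0, only the two seed URLs are visited. In crawler4j itself, each URL from the list would be registered as a seed through CrawlController.addSeed before starting the crawl.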
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow