Question

I am using SharePoint Server 2007 on Windows Server 2008. I use Search Center to crawl a web data source (i.e. web pages from other web sites). My question concerns the crawled-page counters displayed on the log page for that web data source in Search Center.

Three crawl counters are displayed: a success counter, a failure counter and a warning counter. Can any of these counter values include duplicate URLs? For example, suppose that for the web data source www.mysite.com the log reports 1000 pages crawled successfully, 10 failed and no warnings. Does that mean there are 1000 distinct web pages stored in Search Center, or could the 1000 counted pages include duplicated URLs?

BTW: I am confused about this because I have set up a daily incremental crawl. For example, if http://www.mysite.com/1.html is crawled both yesterday and today (successfully both times), will it be counted twice? I would appreciate it if anyone could point me to documentation on what the counters mean.

thanks in advance, George

Solution

If you crawl a regular website, the crawler follows each of the links it finds. It shouldn't duplicate pages, but it will encounter references to the same page (the home page, for example) many times. Ultimately, to determine the number of pages or items, look at the Items in Index count rather than the number of items crawled.
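
To make the distinction concrete, here is a toy model in Python. It is not SharePoint's crawler and says nothing about how SharePoint computes its counters; the `SITE` link graph, the `crawl` function and the `index` dictionary are all hypothetical names made up for illustration. It only shows how a crawler can see many references to a page while storing each URL once, and how re-crawling the same URL on a second day can update an entry instead of adding a new one.

```python
from collections import deque

# Hypothetical three-page site: every page links back to the home page.
SITE = {
    "/": ["/1.html", "/2.html"],
    "/1.html": ["/", "/2.html"],
    "/2.html": ["/", "/1.html"],
}

def crawl(start, links):
    """Breadth-first crawl over a link graph.

    Returns the set of distinct pages visited and the total number
    of link references encountered along the way.
    """
    seen = {start}
    queue = deque([start])
    references = 0
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            references += 1          # every link counts as a reference
            if target not in seen:   # but each URL is visited only once
                seen.add(target)
                queue.append(target)
    return seen, references

# Assumption: the index is keyed by URL, so a re-crawl overwrites the entry.
index = {}
for day in ("yesterday", "today"):
    pages, refs = crawl("/", SITE)
    for url in pages:
        index[url] = day

print(len(pages), refs, len(index))
```

In this sketch the crawler follows six link references but visits only three distinct pages, and after two daily crawls the index still holds three items.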

Licensed under: CC-BY-SA with attribution
Not affiliated with sharepoint.stackexchange