Question

I have a simple HTML documentation site with a TOC column on the left side that includes all the links on the page. The URLs are structured like `http://documentation/section1_subsection1_subsection2.htm`.

When I set SharePoint to crawl the site, only 41 pages are indexed (out of 5000+). I don't get any crawl errors; it's as if SharePoint doesn't see the other pages, even though they're listed as HTML links right in the page.

SharePoint seems to be stopping halfway through `http://documentation/section2_subsection3`, so it's seeing some of the links but not others.

Content source settings:

start address: `http://documentation`
only crawl within the server of each start address
incremental crawl: every night at 6pm
full crawl: every Sunday

I did set a crawl rule for http://documentation/*, but that didn't make a difference.

There is no noindex set on the site.

Another fun part of this is that I recently had to recreate the Search Service Application. Before I recreated it, all of the content in the source was crawled.

Solution

I realized that the site's links are only loaded when a TOC link is clicked, so they aren't in the page's HTML for the crawler to follow until then. There's an XML sitemap file, so I'm working on figuring out how to set up a BCS connector.
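
While the BCS connector is being worked out, one possible stopgap is to turn the sitemap into a static page of plain links that a crawler can follow. The following is only a minimal sketch, not the BCS approach itself. It assumes the sitemap uses the standard sitemaps.org `urlset` format, which isn't confirmed for this site; the function names `sitemap_urls` and `link_page` and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Namespace used by sitemaps.org-style sitemap files (an assumption about this site).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the text of every <url>/<loc> element in the sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def link_page(urls):
    """Build a static HTML page with one plain <a href> per URL."""
    items = "\n".join(
        f'<li><a href="{escape(u, quote=True)}">{escape(u)}</a></li>'
        for u in urls
    )
    return f"<html><body><ul>\n{items}\n</ul></body></html>"


# Hypothetical sample data; a real run would read the site's own sitemap file.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://documentation/section1_subsection1_subsection2.htm</loc></url>
  <url><loc>http://documentation/section2_subsection3</loc></url>
</urlset>"""

urls = sitemap_urls(SAMPLE)
page = link_page(urls)
```

The generated page could then be saved on the documentation server and linked from the start address, so that every page is reachable through ordinary static links.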

Licensed under: CC-BY-SA with attribution
Not affiliated with sharepoint.stackexchange