Question

In our SharePoint solution we need a service which can search and index XML files.

At the moment we are using FAST, but are there other tools / applications / solutions which are free and easy to integrate / customize within SharePoint 2007?

Solution

XML files are searched and indexed natively in SharePoint 2010 by the out-of-the-box engine. FAST is just a search add-on/alternative, but it uses the native engine under the covers, so it's a supplement and will search XML files the same way.

It's a little confusing because you say you're using FAST (generally a product related to SharePoint 2010) but ask for tools for SharePoint 2007. I guess you can use FAST to crawl 2007 sources; I've just never tried.

In SharePoint 2010 and 2007, XML files are natively indexed and searched; however, the engine only searches content, not markup. For example, given the following snippet:

<RenderPattern Name="EditPattern">

The engine will only return a hit if you enter EditPattern, not RenderPattern. It also searches the values inside a tag, so this snippet:

<SomeTag>SomeValue</SomeTag>

Will yield a search hit if you're looking for SomeValue but no hits if you look for SomeTag. It will also search content inside CDATA sections, as long as it's enclosed in quotes, so this snippet:

<HTML><![CDATA[<input TYPE=HIDDEN NAME="owsfileref">]]></HTML>

Will return a hit if you search for owsfileref (but not CDATA).
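
To make the content-versus-markup distinction concrete, here is a rough Python sketch of the idea. It is not SharePoint's XML iFilter or anything taken from it; the function name `searchable_terms` and the wrapping `<Root>` element are made up for illustration. It also simplifies the CDATA case: it tokenizes everything inside the CDATA section, which is broader than the "only quoted values" behavior described above.

```python
import re
import xml.etree.ElementTree as ET


def searchable_terms(xml_text):
    """Collect words from attribute values and element text.

    Tag names and attribute names (the markup) are deliberately skipped.
    This is an illustrative approximation, not SharePoint's actual indexer.
    """
    root = ET.fromstring(xml_text)
    terms = set()
    for element in root.iter():
        # Attribute values count as content; attribute names do not.
        chunks = list(element.attrib.values())
        # The parser surfaces CDATA sections as ordinary element text.
        if element.text:
            chunks.append(element.text)
        if element.tail:
            chunks.append(element.tail)
        for chunk in chunks:
            terms.update(re.findall(r"\w+", chunk))
    return terms


# The three snippets from this answer, wrapped in a hypothetical root element.
sample = """<Root>
  <RenderPattern Name="EditPattern" />
  <SomeTag>SomeValue</SomeTag>
  <HTML><![CDATA[<input TYPE=HIDDEN NAME="owsfileref">]]></HTML>
</Root>"""

terms = searchable_terms(sample)
for word in ("EditPattern", "RenderPattern", "SomeValue", "SomeTag", "owsfileref", "CDATA"):
    print(word, word in terms)
```

Values such as EditPattern, SomeValue and owsfileref end up in the term set, while element names and the CDATA keyword itself do not, which mirrors the hits and misses described above.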

Like I said, XML files are natively crawled. You can install the Microsoft iFilter pack but all it gives you is additional Office 2007 formats and Visio file formats.

There are additional third-party iFilters available. You can check out IFilter dot org for some, but most companies have not produced 2010 versions, and I'm not sure whether you're looking for 2007 or 2010. For 2007 there are still iFilters out there, but few are being supported these days and even fewer are being updated for 2010.

Other tips

Pretty sure there is an XML iFilter included with SharePoint Server 2010. Not sure how deep it goes into each document.

Licensed under: CC-BY-SA with attribution
Not affiliated with sharepoint.stackexchange