Question

I understand that MarkLogic is designed for data wrapped in XML.

I loaded a tab-delimited .txt file into a database, and I have been trying every way I can think of to run a word search against it.

I tried the search:search XQuery function in Query Console, but the result contains only the first occurrence of the search keyword. I suspect MarkLogic is treating the whole .txt file as if it were wrapped in a single XML tag.

I would like to be able to search this flat text file and get search results that would look similar to a Google search results page. Is this possible? How? Or does MarkLogic expect all data to be in XML format?

Solution

MarkLogic can manage content in XML, JSON, binary, or text format, so your data does not have to be XML.

How did you load the data? For a tab-delimited, CSV-style file, I'd suggest loading it with Content Pump: http://docs.marklogic.com/guide/ingestion/content-pump#id_70366

A .csv file is usually an export from a relational table or Excel, in which case each row becomes an individual document in MarkLogic. From your description, it sounds like the file was loaded in its entirety rather than broken down into individual documents. This is easy to verify in Query Console: click the 'Explore' button and you should see multiple URIs. If you see only one URI for the file you loaded, then it was stored as a single document, and that is why you get only one result for the search.
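To make the one-document-versus-many point concrete, here is a minimal Python sketch of the idea, not MarkLogic code. The function names, the /rows/ URI prefix, and the sample data are all made up for illustration. It splits tab-delimited text into one record per row, so a word search can return every matching row instead of a single hit for the whole file.

```python
import csv
import io


def rows_to_documents(tsv_text, uri_prefix="/rows/"):
    """Split tab-delimited text into one (uri, record) pair per data row.

    The first line is treated as the header row. The URI scheme is a
    hypothetical one chosen for this example.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    docs = []
    for n, row in enumerate(reader, start=1):
        docs.append((f"{uri_prefix}{n}.json", dict(row)))
    return docs


def search(docs, word):
    """Return the URIs of documents in which any field contains the word."""
    word = word.lower()
    return [
        uri
        for uri, record in docs
        if any(word in str(value or "").lower() for value in record.values())
    ]


# Made-up sample data: a header line plus three rows.
sample = "id\tname\tcity\n1\tAda\tLondon\n2\tGrace\tNew York\n3\tAlan\tLondon\n"

docs = rows_to_documents(sample)
hits = search(docs, "london")
print(len(docs), "documents;", len(hits), "match 'london':", hits)
```

If the same text were kept as one document, the search could only ever report that single document, which matches the behaviour described in the question.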

Yes, you can get Google-style search results with MarkLogic. You may want to take a look at AppBuilder: it quickly generates a search app with Google-style results and provides a Google-style search grammar. If you want to roll your own, check out snippeting in the REST API.

http://docs.marklogic.com/guide/app-builder/intro#chapter

http://docs.marklogic.com/guide/rest-dev/search#id_83997
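For readers unfamiliar with the term, a snippet is the short excerpt around a match that a Google-style results page shows. The following Python sketch only illustrates that concept and is not how MarkLogic or its REST API implements snippeting. The function name, the width parameter, and the bracket marking are assumptions made for this example.

```python
import re


def snippet(text, term, width=30):
    """Return a short excerpt around the first match of term, or None.

    The matched text is wrapped in square brackets, and "..." marks
    the places where the surrounding text was cut off.
    """
    match = re.search(re.escape(term), text, flags=re.IGNORECASE)
    if match is None:
        return None
    start = max(0, match.start() - width)
    end = min(len(text), match.end() + width)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return (
        prefix
        + text[start:match.start()]
        + "[" + match.group(0) + "]"
        + text[match.end():end]
        + suffix
    )


print(snippet("MarkLogic can search text documents", "search", width=4))
```

In a real application, the search service would return an excerpt like this for each matching document, which is why splitting the file into one document per row matters for the results page.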

Licensed under: CC-BY-SA with attribution