Setting maximum string length in ExtractingRequestHandler ("Solr Cell") .. setMaxStringLength()

StackOverflow https://stackoverflow.com/questions/16724201

  •  30-05-2022

Question

I'm using Solr and ExtractingRequestHandler to index documents but I do not know how to do the equivalent of Tika setMaxStringLength().

It appears to index all of the smaller documents but not all of the text of a large document, which might imply that it is not calling tika.setMaxStringLength(-1).

Is it possible to set the value in solrconfig.xml? Is it possible to pass the value along with other parameters when posting using curl?

Solution

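Check the Solr config file (solrconfig.xml) for the field length limit:

<maxFieldLength>10000</maxFieldLength>

This setting caps the number of tokens indexed per field, so any text beyond the limit in a large document is silently dropped. That might be what you are seeing.

Which version of Solr are you using? The setting may have been deprecated or removed in yours.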
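
If you are on a pre-4.0 release, a minimal sketch of raising the limit could look like the fragment below. It assumes the older solrconfig.xml layout in which the setting lives under <indexDefaults>; in some versions it also appears under <mainIndex>, or under <indexConfig> in late 3.x, so check where your own file defines it. The value shown is simply the largest 32-bit integer, used here to make the limit effectively unbounded.

```xml
<!-- solrconfig.xml (Solr 3.x and earlier; element placement is an assumption,
     check where your own config defines maxFieldLength) -->
<indexDefaults>
  <!-- Maximum number of tokens indexed per field.
       2147483647 (Integer.MAX_VALUE) effectively removes the cap. -->
  <maxFieldLength>2147483647</maxFieldLength>
</indexDefaults>
```

Documents indexed before the change keep their truncated content, so they need to be re-posted after restarting Solr or reloading the core.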

IndexConfig in SolrConfig

The maxFieldLength parameter was removed in Solr 4. If restricting the length of fields is important to you, you can get similar behavior with the LimitTokenCountFilterFactory, which can be defined for the fields you'd like to limit. For example, <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="10000"/> would limit the field to 10,000 tokens.
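
For context, here is a rough sketch of how that filter might sit inside a field type in schema.xml. The field type name text_limited and the field name content are hypothetical, and the tokenizer and lowercase filter are only a plausible analysis chain, not something taken from the question. The point is that the limit applies per field type, at index time.

```xml
<!-- schema.xml: "text_limited" and "content" are hypothetical names for illustration -->
<fieldType name="text_limited" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Stop indexing after 10,000 tokens; later text in the field is ignored -->
    <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="10000"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="content" type="text_limited" indexed="true" stored="true"/>
```

Conversely, if the goal is to index the full text of large documents on Solr 4 or later, make sure the field type that receives the extracted content does not include this filter.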

Licensed under: CC-BY-SA with attribution