Add a PatternReplaceFilterFactory to your analyzer chain after ShingleFilterFactory, and use it to replace every token that contains the filler token (the underscore, "_") with an empty string, i.e. "".
This may solve your problem temporarily, but for a permanent solution you will have to write your own analyzer or customize ShingleFilter.
Sample FieldType:
<fieldType name="text_general_shingle" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="true"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern=".*_.*" replacement=""/>
  </analyzer>
</fieldType>
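
Note that PatternReplaceFilterFactory only rewrites the token text, so the matching shingles stay in the stream as empty tokens. If that turns out to be a problem, one possible extension is to drop them with a LengthFilterFactory at the end of the chain. The following is only a sketch: the field type name text_general_shingle_clean and the max="255" value are assumptions chosen for illustration, not something taken from your schema.

```xml
<!-- Hypothetical variant of the field type above; the name and max value are illustrative. -->
<fieldType name="text_general_shingle_clean" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="true"/>
    <!-- Blank out every token that contains the "_" filler token. -->
    <filter class="solr.PatternReplaceFilterFactory" pattern=".*_.*" replacement=""/>
    <!-- Discard the empty tokens left behind by the previous filter. -->
    <filter class="solr.LengthFilterFactory" min="1" max="255"/>
  </analyzer>
</fieldType>
```

Keep in mind that the pattern also blanks out any real token that happens to contain an underscore, so check the result on the Analysis screen of the Solr admin UI before relying on it.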