Solr/Carrot2 Integration

https://stackoverflow.com/questions/20560016

01-09-2022

Question

Solr/Carrot2 Integration

I have multiple text files; for each one I created an XML file to index the document in Solr, as below:

<add>
  <doc>
    <person>data</person>
    <organization>data here</organization>
    <content>Some Spanish text here</content>
  </doc>
</add>

Schema used for indexing:

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="person" type="string" indexed="true" stored="true" required="true" multiValued="true" />
<field name="organization" type="string" indexed="true" stored="true" required="true" multiValued="true" />
<field name="content" type="text_es" indexed="true" stored="true" multiValued="true" />
<field name="location" type="string" indexed="true" stored="true" required="true" multiValued="true" />

Now I am trying to integrate Carrot2 clustering. For that I followed this link: http://carrot2.github.io/solr-integration-strategies/carrot2-3.8.0/index.html

My problem is that the clustering query returns only one cluster, as below:

<arr name="clusters">
  <lst>
    <arr name="labels">
      <str>Other Topics</str>
    </arr>
    <double name="score">0.0</double>
    <bool name="other-topics">true</bool>
    <arr name="docs">
      <str>#.txt</str>
      <str>abci-britanicos-pizzerias-201312120250.txt</str>
      <str>abci-arqueologos-israelis-descubren-primer-201312111303.txt</str>
      <str>abci-autoridad-fiscal-pensiones-201312111956.txt</str>
      <str>abci-buenas-razones-para-cambiar-201312110933.txt</str>
      <str>abci-audio-asamblea-aserpinto-201312112139.txt</str>
      ...
    </arr>
  </lst>
</arr>

I should get more clusters; my corpus contains 60 text documents.

Solution

In order for search results clustering to work in Solr, the title and content fields you pass for clustering must be stored. The declaration in Solr schema could look like this:

<field name="content" type="text" indexed="true" stored="true" />
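
As a rough illustration of how stored fields are handed to the clustering component, here is a minimal solrconfig.xml sketch. It assumes a Solr 4.x setup with the clustering contrib on the classpath; the handler name /clustering and the engine name default are placeholders. The asker's schema has no title field, so as an assumption carrot.title points at content as a stand-in; substitute a real stored title field if you have one.

```xml
<!-- Sketch only: registers a Carrot2 engine and maps stored fields to it. -->
<searchComponent name="clustering" class="solr.clustering.ClusteringComponent">
  <lst name="engine">
    <str name="name">default</str>
    <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
  </lst>
</searchComponent>

<requestHandler name="/clustering" class="solr.SearchHandler">
  <lst name="defaults">
    <bool name="clustering">true</bool>
    <str name="clustering.engine">default</str>
    <!-- Cluster the documents returned by the search -->
    <bool name="clustering.results">true</bool>
    <!-- Assumption: no title field in the schema, so content doubles as title -->
    <str name="carrot.title">content</str>
    <str name="carrot.snippet">content</str>
    <str name="carrot.url">id</str>
  </lst>
  <arr name="last-components">
    <str>clustering</str>
  </arr>
</requestHandler>
```

Every field named in carrot.title and carrot.snippet has to be declared with stored="true", otherwise the clustering engine receives no text to work with.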

OTHER TIPS

In addition to what Stanislaw said about fields being stored, please provide the query you used for clustering and, ideally, the full schema used to index your data.

If you have a mere 60 documents in your index and the query matches only a small subset of them, there will be nothing to cluster on.
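
Solr clusters only the documents returned for the request, and the default rows value is 10, so a small result page can also leave too little to cluster. A hedged sketch of defaults that would send the whole 60-document corpus to the clustering engine follows; the match-all query and the value 100 are illustrative assumptions, and the fragment is meant to be merged into the defaults of whichever request handler runs clustering.

```xml
<!-- Sketch only: merge into the clustering request handler's defaults. -->
<lst name="defaults">
  <!-- Match every document instead of a narrow subset -->
  <str name="q">*:*</str>
  <!-- Return (and therefore cluster) up to 100 documents rather than the default 10 -->
  <int name="rows">100</int>
</lst>
```

The same values can be passed per request as q=*:*&rows=100 instead of being set as defaults.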

Licensed under: CC-BY-SA with attribution
