Question

I have 700,000 street names, 8,111 municipality names, and 80,333 locality postcodes. I would like to index all of this information in Solr. The user wants to search it through an Ajax autocomplete form. I have tested it with a small amount of data, and the autocomplete form behaves correctly.

 <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

The problem arises when loading all the data into Solr:

  • How should I load the information into the Solr server? (I'm in a Grails app and I need to load the instances that hold the information without the DataImportHandler.) I spent many hours on it today and the Grails console finally crashed :( Should I use a Grails script instead of writing a service and running it from the Grails console?
  • Or should I use the DataImportHandler to load it faster? Can I concatenate string values from different columns of different tables with the DataImportHandler?

(Is it okay to have a separate document for each one, i.e. 700,000 + 8,111 + 80,333 documents?)

Thanks for your time.


Solution

I assume your municipalities, street names, and postcodes are supposed to be autocompleted separately. In that case, you'd use a separate Solr core for each one.
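As a rough sketch of that layout, a Solr 3.x solr.xml can declare one core per data set. The core names and instanceDir values below are assumptions chosen for illustration; each instanceDir would need its own conf/schema.xml and conf/solrconfig.xml.

```xml
<!-- solr.xml (legacy Solr 3.x multicore format); core names are hypothetical -->
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- one core per autocomplete source -->
    <core name="streets" instanceDir="streets"/>
    <core name="municipalities" instanceDir="municipalities"/>
    <core name="postcodes" instanceDir="postcodes"/>
  </cores>
</solr>
```

With this layout, each autocomplete field would query its own core, for example /solr/streets/select for street names.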

Or should I use the DataImportHandler to load it faster?

DIH will be pretty fast, and as long as this information doesn't change very often, it should be fine to do it this way.

Can I concatenate string values from different columns of different tables with the DataImportHandler?

Yes; in data-config.xml you specify the SQL query for each entity, so you can use the database's native concatenation (e.g. || in Oracle).
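A minimal sketch of what such a data-config.xml might look like, assuming an Oracle database. The table names, column names, connection details, and the id and label schema fields are all hypothetical, since the question does not show the database layout.

```xml
<dataConfig>
  <!-- connection details are placeholders, not values from the question -->
  <dataSource type="JdbcDataSource"
              driver="oracle.jdbc.OracleDriver"
              url="jdbc:oracle:thin:@//localhost:1521/XE"
              user="solr_reader"
              password="changeme"/>
  <document>
    <!-- joins two hypothetical tables and concatenates their name columns with || -->
    <entity name="street"
            query="SELECT s.id AS id, s.name || ', ' || m.name AS label
                   FROM street s
                   JOIN municipality m ON m.id = s.municipality_id">
      <!-- Oracle returns unquoted aliases in upper case -->
      <field column="ID" name="id"/>
      <field column="LABEL" name="label"/>
    </entity>
  </document>
</dataConfig>
```

The DataImportHandler also has to be registered as a request handler in solrconfig.xml; a full import is then typically started by requesting /dataimport?command=full-import on that core.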

OTHER TIPS

Seriously, write a shell script and use curl to send the updates to Solr.

You are trying to shoot cans off the wall with a cannon mounted on a ship floating in your swimming pool. You don't need a cannon or a ship or a pool. Just stand there with an air gun and pop the updates off one by one until done.

For an example shell script complete with sample Solr updates, download the Solr binary distribution, either apache-solr-3.5.0.tgz or apache-solr-3.5.0.zip, from a mirror near you. Find a mirror at http://lucene.apache.org/solr/downloads.html

Unpack the archive, go into the example directory and follow these instructions http://lucene.apache.org/solr/tutorial.html

If you are on UNIX, just use post.sh.
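For reference, the files that post.sh sends are plain Solr XML update messages. A minimal sketch of one such file follows; the field names (id, type, name) and the sample values are made up for illustration and would have to match your own schema.xml.

```xml
<!-- addresses.xml: a hypothetical update file with one document per record -->
<add>
  <doc>
    <field name="id">street-1</field>
    <field name="type">street</field>
    <field name="name">Calle Mayor</field>
  </doc>
  <doc>
    <field name="id">municipality-1</field>
    <field name="type">municipality</field>
    <field name="name">Madrid</field>
  </doc>
  <doc>
    <field name="id">postcode-1</field>
    <field name="type">postcode</field>
    <field name="name">28013</field>
  </doc>
</add>
```

In the 3.5.0 example setup, post.sh uses curl to post each file to the update handler (http://localhost:8983/solr/update by default) and sends a commit at the end, so splitting the data into several files of a few thousand documents each keeps every request small.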

By the way, check the Solr version that you have installed on your server. If it isn't 3.5.0, why are you using an old version when the newer one is available right here, right now?

Licensed under: CC-BY-SA with attribution