Question

I'm looking for a suitable tool for the following problem. I have a collection of more than 600,000 locations. Each location has a name (in several languages) and sits in a tree structure: region->country->admin division->city->zip. Users can add custom locations, but I expect that to happen rarely. The application must efficiently support searching by location name and type, building a hierarchical name (e.g. "London->England->United Kingdom"), and building a subtree of locations (e.g. all European countries together with the cities in them).
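To make the requirements concrete, here is a minimal in-memory sketch of the tree, the hierarchical name, and the subtree lookup. The class `Location`, its fields, and the type strings are hypothetical names chosen for illustration, and the sketch assumes that region-level nodes are left out of the hierarchical name so that the output matches the example above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model: one node per location, linked to its parent and children.
final class Location {
    final String name;
    final String type;      // assumed values: "region", "country", "admin division", "city", "zip"
    final Location parent;  // null for a root node
    final List<Location> children = new ArrayList<>();

    Location(String name, String type, Location parent) {
        this.name = name;
        this.type = type;
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);
        }
    }

    // Walks the parent links upward and joins the names with "->".
    // Assumption: the walk stops before a region-level node.
    String hierarchicalName() {
        StringBuilder sb = new StringBuilder();
        for (Location l = this; l != null && !"region".equals(l.type); l = l.parent) {
            if (sb.length() > 0) {
                sb.append("->");
            }
            sb.append(l.name);
        }
        return sb.toString();
    }

    // Collects every descendant whose type is in the given set (depth-first).
    List<Location> descendants(Set<String> types) {
        List<Location> out = new ArrayList<>();
        Deque<Location> stack = new ArrayDeque<>(children);
        while (!stack.isEmpty()) {
            Location l = stack.pop();
            if (types.contains(l.type)) {
                out.add(l);
            }
            for (Location c : l.children) {
                stack.push(c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Location europe = new Location("Europe", "region", null);
        Location uk = new Location("United Kingdom", "country", europe);
        Location england = new Location("England", "admin division", uk);
        Location london = new Location("London", "city", england);

        System.out.println(london.hierarchicalName()); // London->England->United Kingdom

        Set<String> wanted = new HashSet<>(Arrays.asList("country", "city"));
        System.out.println(europe.descendants(wanted).size()); // 2
    }
}
```

Whether 600,000 such nodes fit comfortably in memory, and how name search would be indexed on top of this, are separate questions that the sketch does not address.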

I've considered three solutions.

  1. Plain database: locations are stored in a few tables and the main tree-building logic is implemented in Java code. My concern with this solution is performance, because searching, building the tree, and creating custom locations can all require additional table joins.

  2. Solr: at first glance this task looks like a perfect fit for Solr: the data set rarely changes and we need to search by name. But I'm not sure whether Solr's pivot feature will cover the tree-building needs. I'm also not sure Solr search will be much better than a plain DB, because the search itself is simple (just matching names, which are short strings).

  3. Graph DB (Neo4j): it seems well suited to building trees and subtrees. But I'm not sure about search performance (it seems I would have to use the Community Edition, which lacks some useful performance features such as caching).


Solution

A plain database is a big NO, because an RDBMS is not optimized for relationship-based queries. For example: show me the people who eat at the same restaurant as I do and also live in the same region as I do. It gets worse when the level of a relationship has to be calculated; a DB query like that can be a performance killer. For instance, I am your second-level friend if one or more of your friends are also my friends.
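To illustrate what "calculating the level of a relationship" involves, here is a minimal sketch of a breadth-first search over a friendship map. The class `RelationLevels` and the method `level` are hypothetical names, and the adjacency map stands in for whatever storage is actually used. It is only meant to show the hop-by-hop traversal that a graph database is designed for.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical helper: finds how many "friend" hops separate two people.
final class RelationLevels {

    // Returns the number of hops from start to target, or -1 if target is unreachable.
    static int level(Map<String, List<String>> friends, String start, String target) {
        Map<String, Integer> dist = new HashMap<>(); // hops from start to each visited person
        Queue<String> queue = new ArrayDeque<>();
        dist.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String cur = queue.remove();
            if (cur.equals(target)) {
                return dist.get(cur);
            }
            for (String next : friends.getOrDefault(cur, Collections.emptyList())) {
                if (!dist.containsKey(next)) { // visit each person only once
                    dist.put(next, dist.get(cur) + 1);
                    queue.add(next);
                }
            }
        }
        return -1;
    }
}
```

With a map where "me" and "you" share the friend "ann", `RelationLevels.level(map, "me", "you")` returns 2, i.e. a second-level friend.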

Solr: Solr is a good option, but check its performance impact first. With so many rows to index it can be a memory killer. Go through these pages before implementing Solr:

http://wiki.apache.org/solr/SolrPerformanceProblems

http://wiki.apache.org/solr/SolrPerformanceFactors

Solr is also not a good solution for more logic-heavy searches, as you have to learn all of it before going down that route.

Neo4j (or any other graph DB) is the perfect solution. I have implemented all three of these technologies myself, and in my experience Neo4j is the best fit for this kind of requirement.

However, make sure you work out how to back up the database and how to recover it after a crash.

All the best.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow