Question

Lucene can work with a distributed Infinispan cache. I'm wondering at what point it makes sense to move from Lucene + Infinispan to Katta, which is based on Hadoop. When is Katta the more effective choice, and when is Lucene + Infinispan? I've read that Hadoop is not suitable for real-time systems, but what about Katta?

Solution

What are your requirements? I would estimate that 99% of people on SO who ask for ultra-scalable Lucene find that Solr (or even out-of-the-box Lucene) more than meets their needs.

If you are one of the rare people with thousands of queries per second over petabytes of data, look at what LinkedIn does: it uses a Lucene+Hadoop-based solution (Zoie) for its real-time searches.

I'm not sure where you read that Hadoop "isn't suitable for real-time systems". No doubt there are certain systems for which its framework isn't ideal, but there are tons of real-time apps running on Hadoop.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow