Question

I currently store around 200 million tweets for analytics. They take up about 300 GB in a MySQL database, and the data set will keep growing. I run many search and analytical queries on this data, and so far MySQL has performed as expected. In the future I would like to scale horizontally while retaining the existing full-text and analytical query power. What options should I look at (both relational and NoSQL)? I currently use Solr for full-text search.


Solution

Eventually, and probably very soon, you're going to have too much data for a single server to process efficiently.

Cloudera has integrated Solr with Hadoop (as Cloudera Search), combining full-text search with a distributed cluster of servers that store the data in HDFS. This way you can keep scaling out by adding more servers, and you keep the Solr query syntax you already use.
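To make the "keep your existing queries" point concrete, here is a minimal sketch of how a client might build a full-text query against Solr's standard `/select` handler. The host, the collection name `tweets`, and the field name `text` are assumptions for illustration, not details from your setup. In a SolrCloud-style deployment the same request can be sent to any node, which fans it out across the shards, so the client code should not need to change as servers are added.

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url, collection, term, rows=10):
    """Build a URL for Solr's /select handler.

    base_url, collection and the "text" field are hypothetical;
    substitute the values from your own Solr schema.
    """
    params = {
        "q": f"text:{term}",  # full-text match on the assumed "text" field
        "rows": rows,         # maximum number of documents to return
        "wt": "json",         # ask Solr for a JSON response
    }
    return f"{base_url}/{collection}/select?{urlencode(params)}"


url = build_solr_query_url("http://localhost:8983/solr", "tweets", "hadoop")
print(url)
```

The resulting URL can be fetched with any HTTP client; only `base_url` would differ between a single Solr server and a node in a cluster.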

http://www.cloudera.com/content/cloudera/en/products-and-services/cdh/search.html

Licensed under: CC-BY-SA with attribution