Question

I have a single Linux (Ubuntu) server in my development environment, and I plan to use a single server for the production environment as well.

I have crawl data generated by Nutch 2.2.1 that I would like to store in HBase 0.90.6. Since I don't intend to use multiple machines (all I have is a single server), which HBase mode is ideal for a production environment in my case: pseudo-distributed or fully distributed?

Solution

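Pseudo-distributed mode would be the better choice. Fully distributed mode is meant for a cluster of several machines, so on a single server the realistic options are standalone and pseudo-distributed. In standalone mode HBase uses the local filesystem instead of HDFS, which means you cannot take advantage of the parallelism provided by the HDFS + MapReduce combination.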
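
As a rough sketch of what that looks like in practice, pseudo-distributed mode is usually enabled by pointing `hbase.rootdir` at a local HDFS instance in `conf/hbase-site.xml`. The host and port (`localhost:9000`) are assumptions and must match the NameNode address in your Hadoop configuration. Setting `hbase.cluster.distributed` to `true` is what later HBase documentation describes for this mode; check the 0.90.6 documentation before relying on it.

```xml
<configuration>
  <!-- Store HBase data in HDFS rather than on the local filesystem.
       localhost:9000 is an assumed NameNode address; use your own. -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <!-- Run the HBase daemons as separate processes on this one host.
       Assumption: verify this setting against the docs for your version. -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!-- With a single DataNode, HDFS cannot keep more than one replica. -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```

HDFS needs to be running before HBase is started with this configuration.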

Licensed under: CC-BY-SA with attribution