Question

I want to test and configure Impala with my own Hadoop 2.2.0 distribution, not one of Cloudera's.

I want to know if it's possible to use Impala without CDH, because everything I have read says that Impala depends on CDH.

I'm trying to follow the guide in the Impala GitHub repository - https://github.com/cloudera/impala - and I'll make whatever changes are needed to get it working.

Has anyone already done this, or is it really impossible?

Solution

I think there are two things here that should be addressed separately:

  1. Running Impala on non-CDH Hadoop. It is possible, though it is not tested or supported by Cloudera. However, other Hadoop distributions include Impala: MapR's distribution includes Cloudera Impala, and Amazon has announced support for Impala on Elastic MapReduce. Both have tested that it works with their distributions. I assume you're not using MapR either, but my point is just that it is possible.
  2. Running Impala on Hadoop 2.2.0. This is also possible, since the CDH5 beta 1 release includes Hadoop 2.2.0, so Impala versions 1.2 and higher should work. Make sure you use the latest version (1.2.3 as of now), because the last few point releases contain a number of important fixes.
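
As a rough illustration of the version rule in point 2, here is a minimal Python sketch that checks whether a given Impala version string meets the "1.2 and higher" threshold. The function names and the `MIN_IMPALA` constant are hypothetical, and the threshold is simply taken from the statement above; this is not an official compatibility matrix.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.2.3' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Assumption taken from the answer above: Impala 1.2 and higher should
# work with Hadoop 2.2.0. This is not tested or supported by Cloudera.
MIN_IMPALA = (1, 2)


def impala_should_work(impala_version):
    """Return True if the version meets the assumed minimum."""
    return parse_version(impala_version) >= MIN_IMPALA


if __name__ == "__main__":
    for candidate in ("1.1.1", "1.2", "1.2.3"):
        print(candidate, impala_should_work(candidate))
```

Comparing tuples of integers rather than raw strings avoids mistakes such as treating "1.10" as older than "1.2".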

So yeah, it's possible, though it probably won't be a smooth installation and there isn't a lot of help for this use case. Good luck!

OTHER TIPS

As far as I know, you should use CDH4; other versions of Hadoop have compatibility problems. PS: Impala cannot run on CDH3.

Licensed under: CC-BY-SA with attribution