Question

I am using Cloudera Impala. I have started the statestore on one machine and an impalad instance on every node. How do the impalad instances and the statestore learn about all the other impalad instances so that load can be distributed when querying? I have set up the configuration, but nowhere do I specify the list of data nodes for them to know about.

Thanks.

Solution

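You do not list the nodes anywhere. Each impalad is pointed at the statestore through its startup options (for example -state_store_host) and registers with it when it starts. The statestore (running as the statestored daemon) then works as follows:

  1. It checks the health of all nodes in the cluster and continuously relays that information to all daemons (impalad).
  2. A single statestore is enough for the whole cluster.
  3. If no statestore is available, the daemons still work, but the cluster becomes less robust because they no longer learn when a node fails or joins.
  4. When the statestore comes back online, it re-establishes the connection with the other nodes and resumes its monitoring function.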
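
The registration and relay behaviour described above can be sketched with a toy model. This is not Impala code: the class names ToyStatestore and ToyDaemon, the max_missed threshold, and the host names are all made up for illustration. It only shows the idea that daemons announce themselves to one well-known service, which then pushes the current member list back to everyone.

```python
# Toy model of statestore-style membership tracking.
# Everything here is hypothetical and only illustrates the idea;
# it is not how Impala implements its statestore.

class ToyStatestore:
    def __init__(self, max_missed=3):
        self.max_missed = max_missed   # assumed threshold, chosen for the example
        self.missed = {}               # daemon address -> consecutive missed heartbeats
        self.subscribers = {}          # daemon address -> ToyDaemon object

    def register(self, daemon):
        """Called by a daemon at startup; no node list is configured anywhere."""
        self.subscribers[daemon.address] = daemon
        self.missed[daemon.address] = 0
        self.broadcast()

    def heartbeat_round(self, reachable):
        """Run one health check; `reachable` is the set of addresses that answered."""
        for address in list(self.missed):
            if address in reachable:
                self.missed[address] = 0
            else:
                self.missed[address] += 1
                if self.missed[address] >= self.max_missed:
                    # Treat the daemon as failed and drop it from the cluster view.
                    del self.missed[address]
                    del self.subscribers[address]
        self.broadcast()

    def broadcast(self):
        """Relay the current member list to every daemon still subscribed."""
        members = sorted(self.missed)
        for daemon in self.subscribers.values():
            daemon.known_members = list(members)


class ToyDaemon:
    def __init__(self, address, statestore):
        self.address = address
        self.known_members = []        # filled in by the statestore, not by config
        statestore.register(self)      # the only thing a daemon needs to know


if __name__ == "__main__":
    store = ToyStatestore()
    daemons = [ToyDaemon(f"node{i}:22000", store) for i in range(1, 4)]
    print(daemons[0].known_members)    # all three nodes, learned via the statestore
```

If a daemon stops answering for max_missed rounds, the others receive a shorter member list on the next broadcast. A daemon that loses contact simply keeps its last known list, which mirrors point 3.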

OTHER TIPS

For information on how Impala works, see this

If you want to balance client connections across the impalad instances, you can do that by putting a load balancer in front of them.
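
As a rough illustration of what such a load balancer does, here is a minimal round-robin sketch. The class name RoundRobinBalancer and the host names are hypothetical, and the port is only an example value; in a real deployment this role would normally be played by a dedicated proxy rather than application code.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backend addresses in rotation (illustrative only)."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the list forever, giving simple round-robin order.
        self._cycle = itertools.cycle(backends)

    def next_backend(self):
        """Return the address the next client connection should use."""
        return next(self._cycle)


if __name__ == "__main__":
    # Hypothetical impalad addresses; replace with your own hosts and port.
    balancer = RoundRobinBalancer(
        ["impalad1.example.com:21000", "impalad2.example.com:21000"]
    )
    for _ in range(4):
        print(balancer.next_backend())
```

Each new client connection is sent to the next impalad in turn, so no single daemon ends up coordinating every query.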

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow