Question

On a Cloudera cluster: as a test, I successfully imported a 100k-row MySQL table into HBase with Sqoop. Now I'm trying to import a 264M-row PostgreSQL table. At first I left the number of mappers at the default (no "-m" option), but after 35 to 100 rows the Sqoop job's console gives me this error: "org.apache.hadoop.client.RetriesExhaustedWithDetails: Failed XXX actions: servers with issues: XXXXXXXXXXXXX", even though the machines are running perfectly and all my services show green in Cloudera Manager. To avoid that I tried a single mapper ("-m 1"). No more errors, but after 100-110 rows Sqoop stops adding new rows, and there is nothing in Cloudera's logs... I let the import run for 3 days straight. Some info: I let Sqoop choose the row id, and the original table has a composite primary key of 3 varchar columns. I think I have a misconfiguration somewhere, but I don't know which one...
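
For reference, a minimal sketch of the kind of import command described above (the connection string, credentials, table name, and column family are placeholders, not taken from the original post):

    sqoop import \
      --connect jdbc:postgresql://dbhost:5432/mydb \
      --username myuser -P \
      --table my_big_table \
      --hbase-table my_big_table \
      --column-family cf \
      -m 1

Without an explicit --hbase-row-key, Sqoop picks the row key from a single source column on its own.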


Solution

OK, I figured it out. By default, Sqoop builds the HBase row key from just one of the table's 3 primary-key columns. The result: each new row with the same value in that first key column overwrites the previous one, so the job doesn't freeze, it just stops producing new rows. I have now explicitly specified which columns Sqoop must use to build the row key when storing into HBase.
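
For anyone hitting the same thing, here is a minimal sketch of the fix, assuming Sqoop 1.4.x, where --hbase-row-key accepts a comma-separated list of columns to form a composite row key (the column, table, and connection names are placeholders):

    sqoop import \
      --connect jdbc:postgresql://dbhost:5432/mydb \
      --username myuser -P \
      --table my_big_table \
      --hbase-table my_big_table \
      --column-family cf \
      --hbase-row-key key_col1,key_col2,key_col3 \
      --split-by key_col1 \
      -m 4

With all three key columns in the row key, rows that share the same first key column no longer collide, and --split-by lets multiple mappers run again.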

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow