Question

We are exploring options for archiving data from a data warehouse or RDBMS to Hadoop.

I plan to use Sqoop to load the data into HDFS, probably compressing it, and then delete the archived rows from the source database.

The trouble arises when two tables are related by a foreign key: I need to keep the data consistent between them. Please suggest an approach.


Solution

Luckily, I found a solution using the Sqoop API. I ran a join query to select the data from the child table first, and then from the parent tables. I wrote all of the logic in a Java program using the Sqoop API.
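The original program is not shown, so the following is only a minimal sketch of how the ordering might look. It assumes Sqoop 1 free-form query imports (`--query` with the `$CONDITIONS` placeholder, `--target-dir`, `--compress`, and a single mapper via `-m 1`). The class `ArchivePlan`, the tables `orders` and `order_items`, their key columns, the JDBC URL, and the cutoff predicate are all hypothetical. The sketch only builds the argument arrays and SQL strings; it does not connect to a database or launch Sqoop.

```java
import java.util.Arrays;
import java.util.List;

public class ArchivePlan {

    // Arguments for a Sqoop 1 free-form query import. Sqoop requires the
    // $CONDITIONS token in the WHERE clause and an explicit --target-dir;
    // "-m 1" avoids having to supply --split-by.
    public static String[] importArgs(String jdbcUrl, String query, String targetDir) {
        return new String[] {
            "import",
            "--connect", jdbcUrl,
            "--query", query + " AND $CONDITIONS",
            "--target-dir", targetDir,
            "--compress",
            "-m", "1"
        };
    }

    // Child rows are selected through a join, so only the children of
    // parents that match the archive predicate are exported.
    public static String childQuery(String parent, String child,
                                    String parentKey, String foreignKey, String predicate) {
        return "SELECT " + child + ".* FROM " + child
            + " JOIN " + parent
            + " ON " + child + "." + foreignKey + " = " + parent + "." + parentKey
            + " WHERE " + predicate;
    }

    public static String parentQuery(String parent, String predicate) {
        return "SELECT " + parent + ".* FROM " + parent + " WHERE " + predicate;
    }

    // Deletes run child first, then parent, so the foreign key is never violated.
    public static List<String> deleteStatements(String parent, String child,
                                                String parentKey, String foreignKey, String predicate) {
        return Arrays.asList(
            "DELETE FROM " + child + " WHERE " + foreignKey + " IN (SELECT " + parentKey
                + " FROM " + parent + " WHERE " + predicate + ")",
            "DELETE FROM " + parent + " WHERE " + predicate);
    }

    public static void main(String[] args) {
        String url = "jdbc:mysql://dbhost/shop";                 // hypothetical
        String predicate = "orders.created_at < '2015-01-01'";   // hypothetical

        String[] childImport = importArgs(url,
            childQuery("orders", "order_items", "id", "order_id", predicate),
            "/archive/order_items");
        String[] parentImport = importArgs(url,
            parentQuery("orders", predicate), "/archive/orders");

        // With Sqoop 1 on the classpath, each array could be handed to
        // org.apache.sqoop.Sqoop.runTool(...); here they are only printed.
        System.out.println(String.join(" ", childImport));
        System.out.println(String.join(" ", parentImport));
        deleteStatements("orders", "order_items", "id", "order_id", predicate)
            .forEach(System.out::println);
    }
}
```

In a sketch like this, the delete statements would only be executed after both imports report success, ideally inside a single database transaction, so that a failed import never leaves a parent without its children in either system.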

Licensed under: CC-BY-SA with attribution