Question

If I have multiple similar tables, e.g:

table A: "users", columns: user_name, user_id, user_address, etc etc

table B: "customers" columns: customer_name, customer_id, customer_address, etc etc

table C: "employee" columns: employee_name, employee_id, employe_address, etc etc

Is it possible that using Sqoop to import the three tables into one HBase or Hive table? So After the import, I have one HBase table contains all the records in table A, B, C ?

Was it helpful?

Solution

It's definitely possible if the tables are somehow related. A free-form query can be used in Sqoop to do exactly that. In this case, the free-form query would be a join. For example, when importing into Hive:

sqoop import --connect jdbc:mysql:///mydb --username hue --password hue --query "SELECT * FROM users JOIN customers ON users.id=customers.user_id JOIN employee ON users.id = employee.user_id WHERE \$CONDITIONS" --split-by oozie_job.id --target-dir "/tmp/hue" --hive-import --hive-table hive-table

Similarly, for Hbase:

sqoop import --connect jdbc:mysql:///mydb --username hue --password hue --query "SELECT * FROM users JOIN customers ON users.id=customers.user_id JOIN employee ON users.id = employee.user_id WHERE \$CONDITIONS" --split-by oozie_job.id --hbase-table hue --column-family c1

The key ingredient in all of this is the SQL statement being provided:

SELECT * FROM users JOIN customers ON users.id=customers.user_id JOIN employee ON users.id = employee.user_id WHERE \$CONDITIONS

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top