Question

I imported a table from a SQL database into Hive using sqoop import. Running select count(*) on the table in Hive gives a row count of

231743

But the actual SQL table has 231742 rows.

Why am I getting one row extra for this table?

I imported 2 other similar tables that hold large amounts of data, and their counts match exactly. But this particular table gives me one extra row in Hive. Why is that? :-o

PS: I included --hive-drop-import-delims in the sqoop import command.

Thanks in advance :)

UPDATE: It seems I have duplicate entries in the imported table. They were generated during the import. Does anyone have any idea why? :)


Solution

Okay.. I've solved it.

In the sqoop import command, instead of using --table table-name, I used --query 'SELECT * FROM table-name WHERE $CONDITIONS' (single-quoted so the shell does not expand $CONDITIONS). That fixed it.

Thanks for your comments.

Licensed under: CC-BY-SA with attribution