I loaded the data into a table with all columns defined as string
and then casted the date value and loaded into another table where the column was defined as DATE
. It seems to be working without any issues. The only difference is that I am using a Shark version of Hive, and to be honest with you, I am not sure whether there are any profound differences with actual Hive and Shark Hive.
Data:
hduser2@ws-25:~$ more test.txt
2010-01-05 17:51 Visakh
2013-02-16 09:31 Nair
Code:
[localhost:12345] shark> create table test_time(dt string, tm string, nm string) row format delimited fields terminated by '\t' stored as textfile;
Time taken (including network latency): 0.089 seconds
[localhost:12345] shark> describe test_time;
dt string
tm string
nm string
Time taken (including network latency): 0.06 seconds
[localhost:12345] shark> load data local inpath '/home/hduser2/test.txt' overwrite into table test_time;
Time taken (including network latency): 0.124 seconds
[localhost:12345] shark> select * from test_time;
2010-01-05 17:51 Visakh
2013-02-16 09:31 Nair
Time taken (including network latency): 0.397 seconds
[localhost:12345] shark> select cast(dt as date) from test_time;
2010-01-05
2013-02-16
Time taken (including network latency): 0.399 seconds
[localhost:12345] shark> create table test_date as select cast(dt as date) from test_time;
Time taken (including network latency): 0.71 seconds
[localhost:12345] shark> select * from test_date;
2010-01-05
2013-02-16
Time taken (including network latency): 0.366 seconds
[localhost:12345] shark>
If you are using TIMESTAMP
, then you could try something in the lines of concatenating the date and time strings and then casting them.
create table test_1 as select cast(concat(dt,' ', tm,':00') as string) as ts from test_time;
select cast(ts as timestamp) from test_1;