Question

I have created a table in Hive (0.10.0) from the following values:

2012-01-11  17:51   Stockton    Children's Clothing     168.68  Cash
2012-01-11  17:51   Tampa       Health and Beauty       441.08  Amex
............

Here the date and time are tab-separated values, and I need to work on the date column. Since Hive doesn't allow the "date" datatype, I used TIMESTAMP for the first column (2012-01-11, ...). However, after creating the table, the first column shows NULL values.

How can I solve this? Please guide.

Solution

I loaded the data into a table with all columns defined as string, then cast the date value and loaded it into another table where the column was defined as DATE. It seems to work without any issues. The only difference is that I am using Shark rather than plain Hive, and to be honest, I am not sure whether there are any significant differences between the two.

Data:

hduser2@ws-25:~$ more test.txt 
2010-01-05  17:51   Visakh
2013-02-16  09:31   Nair

Code:

[localhost:12345] shark>  create table test_time(dt string, tm string, nm string) row format delimited fields terminated by '\t' stored as textfile;
Time taken (including network latency): 0.089 seconds
[localhost:12345] shark> describe test_time;
dt  string  
tm  string  
nm  string  
Time taken (including network latency): 0.06 seconds
[localhost:12345] shark> load data local inpath '/home/hduser2/test.txt' overwrite into table test_time;                                                   
Time taken (including network latency): 0.124 seconds
[localhost:12345] shark> select * from test_time;
2010-01-05  17:51   Visakh
2013-02-16  09:31   Nair
Time taken (including network latency): 0.397 seconds
[localhost:12345] shark> select cast(dt as date) from test_time;
2010-01-05
2013-02-16
Time taken (including network latency): 0.399 seconds
[localhost:12345] shark> create table test_date as select cast(dt as date) from test_time;
Time taken (including network latency): 0.71 seconds
[localhost:12345] shark> select * from test_date;
2010-01-05
2013-02-16
Time taken (including network latency): 0.366 seconds
[localhost:12345] shark> 
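Note that the DATE type was only added in Hive 0.12.0, so the cast above is unlikely to be available on the Hive 0.10.0 mentioned in the question. A minimal sketch of an alternative, assuming the date stays a STRING in yyyy-MM-dd form in the test_time table above, is to use Hive's built-in date functions, which accept such strings directly:

```sql
-- dt is still a STRING such as '2010-01-05'; no DATE column is needed
SELECT dt,
       year(dt)                   AS yr,
       month(dt)                  AS mon,
       day(dt)                    AS day_of_month,
       datediff('2013-02-16', dt) AS days_before_ref,  -- reference date chosen for illustration
       date_add(dt, 30)           AS plus_30_days
FROM test_time
WHERE dt >= '2012-01-01';  -- ISO-formatted dates compare correctly as strings
```

Because yyyy-MM-dd strings sort in the same order as the dates they represent, range filters and ORDER BY on the string column behave as expected.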

If you are using TIMESTAMP, you could try something along the lines of concatenating the date and time strings and then casting the result.

create table test_1 as select cast(concat(dt,' ', tm,':00') as string) as ts from test_time;

select cast(ts as timestamp) from test_1;
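The NULLs in the question most likely arise because a TIMESTAMP column in a text file must hold the full yyyy-MM-dd HH:mm:ss[.fffffffff] form, and a bare 2012-01-11 does not match it. The same concatenate-and-cast idea can be sketched for the six-column data from the question. The table names, column names, and file path below are hypothetical; only the column layout is taken from the sample rows:

```sql
-- Staging table: every date/time field is loaded as a plain STRING
CREATE TABLE purchases_raw (
  dt       STRING,
  tm       STRING,
  city     STRING,
  category STRING,
  amount   DOUBLE,
  payment  STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

-- Hypothetical path; replace with the real location of the data file
LOAD DATA LOCAL INPATH '/path/to/purchases.txt' OVERWRITE INTO TABLE purchases_raw;

-- Build 'yyyy-MM-dd HH:mm:00' from the two strings and cast it in one step
CREATE TABLE purchases AS
SELECT CAST(CONCAT(dt, ' ', tm, ':00') AS TIMESTAMP) AS ts,
       city, category, amount, payment
FROM purchases_raw;
```

Keeping the raw staging table around makes it easy to re-run the conversion if some rows turn out to have a different time format.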

OTHER TIPS

It works fine for me when I use the load command from Beeline.

Data:

[root@hostname workspace]# more timedata 
buy,1977-03-12 06:30:23
sell,1989-05-23 07:23:12

Create table statement:

create table mytime(id string ,t timestamp) row format delimited fields terminated by ',';

Load data statement:

load data local inpath '/root/workspace/timedata' overwrite into table mytime;

Table structure:

describe mytime;      
+-----------+------------+----------+--+
| col_name  | data_type  | comment  |
+-----------+------------+----------+--+
| id        | string     |          |
| t         | timestamp  |          |
+-----------+------------+----------+--+

Query result:

select * from mytime;                                                                     
+------------+------------------------+--+
| mytime.id  |        mytime.t        |
+------------+------------------------+--+
| buy        | 1977-03-12 06:30:23.0  |
| sell       | 1989-05-23 07:23:12.0  |
+------------+------------------------+--+
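This load succeeds because each timestamp field already has the full yyyy-MM-dd HH:mm:ss form that Hive expects for TIMESTAMP columns in text files. A quick sanity check after any such load, sketched here against the mytime table above, is to count the rows whose timestamp failed to parse:

```sql
-- Fields that do not match the expected timestamp format are read as NULL
SELECT count(*) AS unparsed_rows
FROM mytime
WHERE t IS NULL;
```

A non-zero count suggests that some input rows use a different format, such as a date without a time part.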

Apache Hive data types matter for both the query language and data modeling (how a table represents the data structures in a company's database), so you need to know the types and their usage when defining table columns. Hive data types fall into two main groups:

Primitive data types
Complex data types

Only the complex data types are covered here. They are further classified into four types, explained below.

2.1 ARRAY
An ordered collection of fields. The fields must all be of the same type.
Syntax: ARRAY<data_type>
Example: array(1, 4)

2.2 MAP
An unordered collection of key-value pairs. Keys must be primitives; values may be of any type.
Syntax: MAP<primitive_type, data_type>
Example: map('a', 1, 'c', 3)

2.3 STRUCT
A collection of elements of different types.
Syntax: STRUCT<col_name : data_type, ...>
Example: struct('a', 1, 1.0)

2.4 UNION
A collection of heterogeneous data types; a value holds exactly one of the listed types at a time.
Syntax: UNIONTYPE<data_type, data_type, ...>
Example: create_union(1, 'a', 63)
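As a rough illustration of how these types appear in a table definition, here is a minimal sketch. The table name complex_demo, its column names, and the chosen delimiters are assumptions made for the example:

```sql
-- Hypothetical table combining three of the complex types
CREATE TABLE complex_demo (
  tags  ARRAY<STRING>,
  props MAP<STRING, INT>,
  addr  STRUCT<city:STRING, zip:INT>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  COLLECTION ITEMS TERMINATED BY '|'
  MAP KEYS TERMINATED BY ':'
STORED AS TEXTFILE;

-- Element access: index for arrays, key for maps, dot notation for structs
SELECT tags[0]    AS first_tag,
       props['a'] AS prop_a,
       addr.city  AS city
FROM complex_demo;
```

With these delimiters, a text row such as x|y,a:1|c:3,Tampa|33601 would populate all three columns.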

Licensed under: CC-BY-SA with attribution