I am trying to load JSON files into Hive using the JSON SerDe. It works when each file holds a single record, but I was wondering whether it's possible to put more than one record in a single JSON file and load them all in one shot. To give an idea, my JSON files look like this:
File 1
{"styles": {"style": "Deep House"}, "genres": {"genre": "Electronic"}}
File 2
{"styles": {"style": "Rock"}, "genres": {"genre": "Techno Rock"}}
I combined them to make one JSON file as follows:
{"styles": {"style": "Deep House"}, "genres": {"genre": "Electronic"}},{"styles": {"style": "Rock"}, "genres": {"genre": "Techno Rock"}}
When I load this file, only the first record is loaded. My table DDL is as below:
create table json_data (
styles struct<style: string>,
genres struct<genre: string>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe';
I use the standard LOAD command:
LOAD DATA LOCAL INPATH '/home/user/json_data' INTO TABLE json_data;
When I query the table, there's only one record inserted.
select * from json_data;
{"style":"Deep House"} {"genre":"Electronic"}
Time taken: 0.76 seconds
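One variation I could try is writing the combined file with one JSON object per line instead of separating the objects with a comma. The Python sketch below shows what I mean. It assumes the SerDe treats each line of the file as one record, which I have not confirmed. The helper names and the output path json_data_lines are made up for illustration.

```python
import json

# The two records from File 1 and File 2 above.
records = [
    {"styles": {"style": "Deep House"}, "genres": {"genre": "Electronic"}},
    {"styles": {"style": "Rock"}, "genres": {"genre": "Techno Rock"}},
]


def to_json_lines(recs):
    """Serialize each record as a single-line JSON object, one per line."""
    return "".join(json.dumps(rec) + "\n" for rec in recs)


def write_json_lines(recs, path):
    """Write the records to `path` (hypothetical file name), one per line."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(to_json_lines(recs))


if __name__ == "__main__":
    # Hypothetical output path; LOAD DATA would then point at this file.
    write_json_lines(records, "json_data_lines")
```

The resulting file has no comma between the objects and no enclosing brackets, just one complete object on each line.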
Am I doing something wrong here with the JSON file creation? Or is it not possible to have two records in one JSON file? Any help would be really appreciated.
Thanks, TM