I am trying to process some log files on a bucket in amazon s3.
I create the table :
CREATE EXTERNAL TABLE apiReleaseData2 (
messageId string, hostName string, timestamp string, macAddress string DISTINCT, apiKey string,
userAccountId string, userAccountEmail string, numFiles string)
ROW FORMAT
serde 'com.amazon.elasticmapreduce.JsonSerde'
with serdeproperties ( 'paths'='messageId, hostName, timestamp, macAddress, apiKey, userAccountId, userAccountEmail, numFiles')
LOCATION 's3://apireleasecandidate1/regression/transferstatistics/2013/12/31/';
Then I run the following HiveQL statement and get my desired output in the file without any issues. My directories are setup in the following manner :
s3://apireleasecandidate1/regression/transferstatistics/2013/12/31/ < All the log files for this day >
What I want to do is that I specify the LOCATION up to the 's3://apireleasecandidate1/regression/transferstatistics/' and then call the
ALTER TABLE <Table Name> ADD PARTITION (<path>)
statement or the
ALTER TABLE <Table Name> RECOVER PARTITIONS ;
statement to access the files in the subdirectories. But when I do this there is no data in my table.
I tried the following :
CREATE EXTERNAL TABLE apiReleaseDataUsingPartitions (
messageId string, hostName string, timestamp string, macAddress string, apiKey string,
userAccountId string, userAccountEmail string, numFiles string)
PARTITIONED BY (year STRING, month STRING, day STRING)
ROW FORMAT
serde 'com.amazon.elasticmapreduce.JsonSerde'
with serdeproperties ( 'paths'='messageId, hostName, timestamp, macAddress, apiKey, userAccountId, userAccountEmail, numFiles')
LOCATION 's3://apireleasecandidate1/regression/transferstatistics/';
and then I run the following ALTER command :
ALTER TABLE apiReleaseDataUsingPartitions ADD PARTITION (year='2013', month='12', day='31');
But running the Select statement on the table gives out no results.
Can someone please guide me what I am doing wrong ?
Am I missing something Important ?
Cheers
Tanzeel