Question

Sorry for all the setup. This is a Hive data type and column comment question.

I have a single file in HDFS which combines 4 sets of table data. Breaking the data out ahead of time is not my preferred option. The first 4 rows specify the column headers:

*1 col1, col2, col3
*2 cola, colb, colc, cold, col5e
etc....

Data rows begin with the number that matches position 1 of their header row:

1 data, data, data
2 data, data, data, data, data
etc...

The base Hive table is just col0 - col60 for the raw file. I've tried creating a CTAS table to hold all of the "1" columns, and another for the "2" columns, where I can specify data types and comments. Since the column names vary by record type, I cannot give the columns meaningful names on the base table, nor can I comment them with column-level metadata.

This DDL didn't work, but it gives an example of what I'm hoping to do. Any thoughts?

CREATE TABLE foo (
col1 as meaningful_name string comment 'meaningful comment')
as
SELECT col1 
FROM base_hive table
WHERE col1 = 1;

CREATE TABLE foo 
as
SELECT col1 string comment 'meaningful comment'
FROM base_hive table
WHERE col1 = 1;

Thanks, TD

No correct solution

OTHER TIPS

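I don't fully understand what you are trying to achieve here, but looking at your DDL, I can see some errors. In the Hive versions I know of, CREATE TABLE AS SELECT does not accept a column list on the target table ("CREATE TABLE AS SELECT command cannot specify the list of columns for the target table"), so a type or comment cannot be declared inside the CTAS itself. The column name comes from the alias in the SELECT, and the type can be set with a CAST. The table name is written here as base_hive_table, since a table name cannot contain a space:

CREATE TABLE foo AS
SELECT CAST(col1 AS STRING) AS meaningful_name
FROM base_hive_table
WHERE col1 = 1;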
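
Since the goal in the question is columns with explicit types and comments, one option is to declare the table first and load it with a separate INSERT ... SELECT instead of a single CTAS. This is a minimal sketch, not a tested solution for this dataset: foo, meaningful_name and col1 are the placeholder names from the question, and base_hive_table is an assumed spelling of the source table name.

```sql
-- Sketch only: foo, meaningful_name and base_hive_table are placeholder names.
-- Step 1: declare the target table with explicit types and column comments.
CREATE TABLE foo (
  meaningful_name STRING COMMENT 'meaningful comment'
);

-- Step 2: load only the rows that belong to record type 1.
INSERT INTO TABLE foo
SELECT CAST(col1 AS STRING)
FROM base_hive_table
WHERE col1 = 1;
```

If the table already exists (for example, one created by a plain CTAS), a comment can also be attached afterwards with ALTER TABLE foo CHANGE COLUMN meaningful_name meaningful_name STRING COMMENT 'meaningful comment'; running DESCRIBE foo should then show the comment next to the column.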

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow