Question

I'm trying to load a floating-point value into a TIMESTAMP column in Hive, but I'm getting an IllegalArgumentException. According to the documentation, this should be read as seconds in Unix time with decimal precision: https://cwiki.apache.org/Hive/languagemanual-types.html#LanguageManualTypes-Timestamps I'm using Hive 0.10.

Schema: `create external table BT_PP (
    Name STRING,
    Application STRING,
    PathId STRING,
    StartTime TIMESTAMP,
    Dimensions MAP<STRING, STRING>,
    Values MAP<STRING, DOUBLE>,
    Failed BOOLEAN,
    VisitId BIGINT,
    ResponseTime DOUBLE,
    Duration DOUBLE,
    CpuTime DOUBLE,
    ExecTime DOUBLE,
    SuspensionTime DOUBLE,
    SyncTime DOUBLE,
    WaitTime DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\;' ESCAPED BY '\\' COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY '='`

Example raw row: Web Page Requests;xxx;PT\=115734\;PA\=314959848\;PS\=1725166795;1378315124.621;Complete Uri Path=/<...>;;false;;1616.58935546875;1616.58935546875;642.5269893486796;927.1752303076349;;;

Query: select * from bt_pp where datediff(from_unixtime(unix_timestamp()), startTime) < 2;

Error:

2013-09-09 12:36:25,687 INFO org.apache.hadoop.mapred.TaskStatus: task-diagnostic-info for task attempt_201308221633_0005_m_000001_3 : java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"name":"Web Page Requests","application":"xxx","pathid":"PT=115734;PA=314959848;PS=1725166795","starttime":"1969-12-31 19:00:00","dimensions":{"Complete Uri Path":"/<...>"},"values":{},"failed":false,"visitid":null,"responsetime":2189.27880859375,"duration":2189.27880859375,"cputime":353.1250106477223,"exectime":940.3325603696519,"suspensiontime":null,"synctime":null,"waittime":null}
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:159)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"name":"Web Page Requests","application":"xxx","pathid":"PT=115730;PA=314959848;PS=1725166795","starttime":"1969-12-31 19:00:00","dimensions":{"Complete Uri Path":"/<...>"},"values":{},"failed":false,"visitid":null,"responsetime":2189.27880859375,"duration":2189.27880859375,"cputime":353.1250106477223,"exectime":940.3325603696519,"suspensiontime":null,"synctime":null,"waittime":null}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:673)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:141)
    ... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating starttime
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:80)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:132)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:654)
    ... 9 more
Caused by: java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
    at java.sql.Timestamp.valueOf(Timestamp.java:185)
    at org.apache.hadoop.hive.serde2.lazy.LazyTimestamp.init(LazyTimestamp.java:74)
    at org.apache.hadoop.hive.serde2.lazy.LazyStruct.uncheckedGetField(LazyStruct.java:219)
    at org.apache.hadoop.hive.serde2.lazy.LazyStruct.getField(LazyStruct.java:192)
    at org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.getStructFieldData(LazySimpleStructObjectInspector.java:188)
    at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.evaluate(ExprNodeColumnEvaluator.java:98)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:76)
    ... 18 more

The row dump in the error shows starttime as "1969-12-31 19:00:00", which is the Unix epoch rendered in EST, rather than the value actually present in the row. The line "Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating starttime" confirms that the failure is tied to that field.


Solution

According to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-Timestamps:

Timestamps in text files have to use the format yyyy-mm-dd hh:mm:ss[.f...]. If they are in another format declare them as the appropriate type (INT, FLOAT, STRING, etc.) and use a UDF to convert them to timestamps.

So you can't simply store a seconds-since-the-epoch float (or integer) value in a TIMESTAMP field of a text-backed table. Declare the column as a numeric type and convert it when you query.
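One way to apply this is sketched below; it has not been tested against the poster's data. It assumes the table is recreated with StartTime declared as DOUBLE (dropping an external table leaves the underlying files in place) and that the built-in `from_unixtime` and `CAST` behave on Hive 0.10 as the language manual describes. The `'\;'` delimiter escaping is copied from the question.

```sql
-- Assumption: StartTime is redeclared as DOUBLE so the raw epoch value
-- (e.g. 1378315124.621) parses; all other columns are as in the question.
CREATE EXTERNAL TABLE BT_PP (
    Name STRING,
    Application STRING,
    PathId STRING,
    StartTime DOUBLE,
    Dimensions MAP<STRING, STRING>,
    `Values` MAP<STRING, DOUBLE>,
    Failed BOOLEAN,
    VisitId BIGINT,
    ResponseTime DOUBLE,
    Duration DOUBLE,
    CpuTime DOUBLE,
    ExecTime DOUBLE,
    SuspensionTime DOUBLE,
    SyncTime DOUBLE,
    WaitTime DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\;' ESCAPED BY '\\'
COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY '=';

-- from_unixtime() takes whole seconds, so the fractional part is dropped here;
-- that does not matter for a day-level comparison.
SELECT *
FROM BT_PP
WHERE datediff(from_unixtime(unix_timestamp()),
               from_unixtime(CAST(StartTime AS BIGINT))) < 2;

-- If a real TIMESTAMP with sub-second precision is needed, the manual's
-- conversion rule for floating-point types suggests a cast should work:
SELECT Name, CAST(StartTime AS TIMESTAMP) AS StartTs
FROM BT_PP
LIMIT 10;
```

The backticks around `Values` are a precaution in case the identifier is treated as a keyword on your Hive version; the original schema used it unquoted.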

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow