Question

I load my files and generate dates for them in two steps:

A = LOAD 'foo.txt' USING PigStorage('\\u001') AS (
    bar:chararray
    ,baz:int
);
B = FOREACH A GENERATE
    ToDate(bar, 'yyyy-MM-dd HH:mm:ss') AS bar
    ,baz
;

How can I do it in one step?

The following throws an error (mismatched input '(' expecting RIGHT_PAREN):

A = LOAD 'foo.txt' USING PigStorage('\\u001') AS (
    ToDate(bar:chararray, 'yyyy-MM-dd HH:mm:ss') AS bar
    ,baz:int
);

or

A = LOAD 'foo.txt' USING PigStorage('\\u001') AS (
    ToDate($0, 'yyyy-MM-dd HH:mm:ss') AS bar
    ,baz:int
);

Solution

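UDFs can't be applied in the schema definition. The AS clause of LOAD only accepts field names and types, so a function call such as ToDate(...) is a syntax error there. Either keep the FOREACH step, or write your own loader that returns the field as a datetime.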
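
If the goal is only to have a single statement rather than a single operator, one possible sketch is to inline the LOAD as a nested relation inside the FOREACH. This assumes a Pig version whose grammar accepts a parenthesised operator as the input of FOREACH. It is not verified against any particular release. The schema still declares bar as a chararray, and ToDate is applied in the GENERATE clause, exactly as in the two-step version.

```pig
-- Sketch only: assumes the Pig version in use accepts an inline (nested) LOAD.
-- The schema keeps plain types; the conversion happens in GENERATE.
B = FOREACH (LOAD 'foo.txt' USING PigStorage('\\u001') AS (
        bar:chararray
        ,baz:int
    )) GENERATE
    ToDate(bar, 'yyyy-MM-dd HH:mm:ss') AS bar
    ,baz
;
```

This still runs the same load and projection as before, so it changes the script's layout, not the execution plan.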

OTHER TIPS

Try using ^A instead of \\u001. For more information, use this link.

Licensed under: CC-BY-SA with attribution