Question

I resort to the following:

A = LOAD 'a.txt' USING PigStorage('\\u001') AS (
    foo:int
    ,bar:chararray
);
B = LOAD 'b.txt' USING PigStorage('\\u001') AS (
    foo:int
    ,baz:long
);
C = JOIN A BY foo, B BY foo;
D = FOREACH C GENERATE
    A::foo AS foo
    ,A::bar AS bar
    ,B::baz AS baz
;

How can I join and define the schema in a single step?

Solution

According to the documentation, you can't define a schema as part of the JOIN itself.
Note: Syntactically, you can nest the statements so that it feels like you saved some steps, like this:

D = foreach
    (join (LOAD 'a.txt' USING PigStorage('\\u001') AS (foo:int ,bar:chararray)) by foo,
          (LOAD 'b.txt' USING PigStorage('\\u001') AS (foo:int ,baz:long)) by foo
    ) generate $0 as foo, $1 as bar, $3 as baz;

But I'd avoid doing so. It's hard to read, and it still produces the same explain plan as the original.
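
If you want to check this yourself, a minimal sketch along the following lines should work. It reuses the relations from the question and only adds Pig's DESCRIBE and EXPLAIN operators; the flattened single-line AS clauses are just a layout choice here, not something the question requires.

```pig
-- Same relations as in the question.
A = LOAD 'a.txt' USING PigStorage('\\u001') AS (foo:int, bar:chararray);
B = LOAD 'b.txt' USING PigStorage('\\u001') AS (foo:int, baz:long);
C = JOIN A BY foo, B BY foo;

-- Print the schema of the join result. The fields come out in the order
-- A::foo, A::bar, B::foo, B::baz, which is why baz is $3 in the nested version.
DESCRIBE C;

D = FOREACH C GENERATE A::foo AS foo, A::bar AS bar, B::baz AS baz;

-- Print the execution plans for D. Running EXPLAIN on the nested
-- one-statement version lets you compare the two plans side by side.
EXPLAIN D;
```

Referencing fields by name (A::foo, B::baz) rather than by position ($0, $3) also keeps the projection correct if a column is later added to either input.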

License: CC-BY-SA with attribution
Not affiliated with StackOverflow