Question

I have a Pig relation that looks something like this:

describe A;
A:{header:(member_id, field_2,..)}

Now I want to pull out just the member IDs, so I do:

A1 = FOREACH A GENERATE A.header.member_id;
A2 = LIMIT A1 10;
dump A2;

This runs for a very long time and ends with the error: Unable to open iterator for alias A2. Backend error : Scalar has more than one row in the output.

What am I doing wrong?

Solution

The issue is with the line:

 A1 = FOREACH A GENERATE A.header.member_id;

You shouldn't reference A in A.header.member_id. Inside a FOREACH, Pig operates on each tuple of A in turn, so the only names in scope are that tuple's fields (in this case only header). Since A is not a field in this scope, Pig checks whether it can treat A as a relation instead (in your example A, A1, and A2 are relations) and read a single scalar value from it. It can only do that if the relation has exactly one row; if it has more, Pig fails with the error you encountered.
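
For contrast, here is a rough sketch of a case where referencing a relation by name inside a FOREACH is valid, because the relation has exactly one row. The file name visits.txt, its schema, and the aliases are hypothetical and only for illustration:

```pig
-- Hypothetical input: one (user, cnt) pair per line, tab-separated.
visits  = LOAD 'visits.txt' AS (user:chararray, cnt:long);

-- GROUP ... ALL collapses everything into a single group,
-- so total ends up with exactly one row.
grouped = GROUP visits ALL;
total   = FOREACH grouped GENERATE SUM(visits.cnt) AS n;

-- total.n refers to the relation total, not to a field of visits.
-- This is allowed only because total has a single row.
share   = FOREACH visits GENERATE user, (double)cnt / total.n AS fraction;

DUMP share;
```

If total had more than one row, this script would fail with the same "Scalar has more than one row in the output" error.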

The solution is just to change A1 to:

 A1 = FOREACH A GENERATE header.member_id;
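
Put together, the corrected script might look like the sketch below. The LOAD statement, the file name members.txt, and the field types are assumptions, since the question does not show how A is loaded:

```pig
-- Assumed input: each line holds one tuple, e.g. (m001,foo)
A  = LOAD 'members.txt' AS (header:tuple(member_id:chararray, field_2:chararray));

-- Project the field through the tuple, without the relation name.
A1 = FOREACH A GENERATE header.member_id;
A2 = LIMIT A1 10;
DUMP A2;
```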
Licensed under: CC-BY-SA with attribution