Question

I'm just starting to learn Pig, so please bear with me. Here's my question: how do I remove the spaces inside a data value?

This is the data:

2 035
356
5 312
62

This is my script:

data = LOAD 'sample.csv' AS (number:chararray);

processed = FOREACH data GENERATE number;

DUMP processed;

How can I edit the script so that it removes the space in 5 312 and returns the value as an integer?


Solution

Here is the solution:

data = load 'sample.csv' as (number:chararray);
b = FOREACH data GENERATE (LONG) REPLACE(number, ' ', '');

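I used the built-in string function REPLACE to strip the spaces, then cast the result to a long. Since the question asks for an integer, an (int) cast works the same way as long as the values fit in 32 bits. This is the signature of REPLACE: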

chararray REPLACE(chararray source, chararray toReplace, chararray newValue)

Parameters:
source: the chararray to search in
toReplace: the chararray to be replaced
newValue: the new chararray to replace it with

Returns: source with all instances of toReplace changed to newValue
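For completeness, here is a minimal sketch of the whole script with an (int) cast, since the question asks for an integer. The alias name cleaned is just an illustrative choice, and the sketch assumes sample.csv holds one value per line with no tab characters, as in the question. Note that Pig treats the second argument of REPLACE as a Java regular expression, so a plain ' ' matches each single space, while regex metacharacters would need escaping.

```pig
-- Assumes sample.csv holds one value per line, as in the question.
data = LOAD 'sample.csv' AS (number:chararray);

-- Strip every space from the value, then cast the cleaned string to int.
-- Use (long) instead if the values may exceed the 32-bit int range.
cleaned = FOREACH data GENERATE (int) REPLACE(number, ' ', '') AS number;

DUMP cleaned;
```

With this, a value such as 5 312 is read as the chararray '5 312', becomes '5312' after REPLACE, and is then cast to the integer 5312.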

Hope this helps

Licensed under: CC-BY-SA with attribution