Question

I have a file with several rows of text. Each row contains an array of strings, represented as follows:

["ABC","D EF","XYZ"]
["MNO","P","QR  ST"]
["A"]
...

Notice that some of the quoted strings contain spaces. I'm reading the file into a Perl script that looks like this:

while (<STDIN>) {
  my @tmp = split /,/, $_;
  # ... do something with @tmp elements.
}

Is there an easy regex-based way to read all the elements into an array, rather than painfully splitting each row and then stripping the quotes and brackets?

Thanks in advance

Solution

It is simple to parse each row with a regular expression.

You don't say in what form you want to store the data, but this short program may help.

I have used Data::Dump to display the contents of the @data array after processing the file.

use strict;
use warnings;

my @data;

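while (<DATA>) {
  # In list context, /g returns every captured group:
  # here, the text between each pair of double quotes
  my @fields = /"([^"]*)"/g;
  # Store each row as its own array reference
  push @data, \@fields;
}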

use Data::Dump;
dd \@data;

__DATA__
["ABC","D EF","XYZ"]
["MNO","P","QR  ST"]
["A"]

Output:

[["ABC", "D EF", "XYZ"], ["MNO", "P", "QR  ST"], ["A"]]

Other answers

Maybe you would be better off with a JSON parser? http://search.cpan.org/dist/JSON-Parse/lib/JSON/Parse.pod
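
As a rough sketch of that suggestion: each row shown in the question happens to be a valid JSON array, so a JSON decoder can read it directly, quoted spaces and all. The example below uses the core JSON::PP module instead of the JSON::Parse module linked above (a substitution on my part, since JSON::PP ships with Perl 5.14 and later). The parse_row helper and the hard-coded @rows list are hypothetical names used only for illustration; in the real script the rows would come from the input file.

```perl
use strict;
use warnings;
use JSON::PP;    # core module; exports decode_json by default

# Hypothetical helper: decode one row such as ["ABC","D EF","XYZ"]
# and return a reference to the array of its elements.
sub parse_row {
    my ($line) = @_;
    return decode_json($line);
}

# Sample rows copied from the question (stand-in for the input file).
my @rows = (
    '["ABC","D EF","XYZ"]',
    '["MNO","P","QR  ST"]',
    '["A"]',
);

# One array reference per row.
my @data = map { parse_row($_) } @rows;

# Print each row with its elements separated by '|'.
print join('|', @$_), "\n" for @data;
```

Unlike splitting on commas, this should also cope with elements that themselves contain commas or escaped quotes, provided the rows really are valid JSON.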

Would something like this work?

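use strict;
use warnings;
use Data::Dumper;

my @tmp;

while (<STDIN>) {
    chomp;
    # Strip everything except letters, digits, whitespace and commas.
    # This removes the quotes and brackets, but also any other
    # punctuation inside the elements.
    s/[^a-zA-Z\d\s,]//g;
    # Flatten the elements of every row into one list
    push @tmp, split /,/, $_;
}
print Dumper(\@tmp);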

Output:

$VAR1 = [
          'ABC',
          'D EF',
          'XYZ',
          'MNO',
          'P',
          'QR  ST',
          'A'
        ];

Edit

Alternative, keeping each row as its own array:

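use strict;
use warnings;
use Data::Dumper;

my @tmp;

while (<STDIN>) {
    chomp;
    # Strip everything except letters, digits, whitespace and commas.
    # This removes the quotes and brackets, but also any other
    # punctuation inside the elements.
    s/[^a-zA-Z\d\s,]//g;
    # Store each row as a separate array reference
    push @tmp, [ split /,/, $_ ];
}
print Dumper(\@tmp);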

Output:

$VAR1 = [
          [
            'ABC',
            'D EF',
            'XYZ'
          ],
          [
            'MNO',
            'P',
            'QR  ST'
          ],
          [
            'A'
          ]
        ];
Licensed under: CC-BY-SA with attribution