Question

I have strings of this kind:

NAME1              NAME2          DEPTNAME           POSITION
JONH MILLER        ROBERT JIM     CS                 ASST GENERAL MANAGER 

I want the output to be NAME1, NAME2 and POSITION. How can I do this using split/regex/trim/etc., without using CPAN modules?

Solution

If your input data comes in as an array of strings (@strings), this

for my $s (@strings) {
   my $output = join ' ',
                map /^\s*(.+)\s*$/ ? $1 : (),
                unpack('A19 A15 x19 A*', $s);
   print "$output\n"
}

would extract and trim the information needed. For the sample input, the output is

NAME1 | NAME2 | POSITION

and

JONH MILLER | ROBERT JIM | ASST GENERAL MANAGER

(I added the '|' characters only to make the result easier to read; the code joins the fields with a single space.)
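As a minimal sketch of the same unpack idea, the version below wraps it in a helper so the template can be read field by field. The name pick_fields is hypothetical, and the widths 19/15/19 are assumed from the sample data; the sample row is built with sprintf so that the columns line up as in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (NAME1, NAME2, POSITION) from one fixed-width line.
# The widths 19/15/19 are an assumption taken from the sample data.
sub pick_fields {
    my ($line) = @_;
    # A19 = NAME1, A15 = NAME2, x19 skips DEPTNAME, A* = rest of the line.
    my @fields = unpack 'A19 A15 x19 A*', $line;
    s/^\s+// for @fields;    # 'A' already strips trailing spaces
    return @fields;
}

# Build a sample row with the same column widths as the question's data.
my $row = sprintf '%-19s%-15s%-19s%s',
    'JONH MILLER', 'ROBERT JIM', 'CS', 'ASST GENERAL MANAGER ';
print join(' | ', pick_fields($row)), "\n";
```

Note that unpack dies if a line is too short for the x19 skip, so lines shorter than 53 characters would need to be filtered or padded first.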

Regards

rbo

Other tips

It depends on whether those are fixed-length fields or tab-separated ones. The easiest case (using split) is tab-separated data:

my ($name1, $name2, $deptName, $position) = split("\t", $string);
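Since only three of the four fields are wanted, a list slice on the result of split can skip DEPTNAME directly. This is a sketch that assumes the record really contains literal tab characters; the sample record is built here for illustration.

```perl
use strict;
use warnings;

# Assumed input: one record with literal tab characters between the fields.
my $string = join "\t", 'JONH MILLER', 'ROBERT JIM', 'CS', 'ASST GENERAL MANAGER';

# Slice the list returned by split: indexes 0, 1 and 3 skip DEPTNAME.
my ($name1, $name2, $position) = (split /\t/, $string)[0, 1, 3];

print "$name1 $name2 $position\n";
```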

If they're fixed length, and assuming they are all, say, 10 characters long, you can parse it like this:

my ($name1, $name2, $deptName, $position) = unpack("A10 A10 A10 A10", $string);
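If the widths are not known in advance, one possible approach is to derive the unpack template from the header line. The sketch below assumes that every column starts exactly where its header word starts and that header names contain no spaces; template_from_header is a hypothetical helper, not something from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: build an unpack template from a header line,
# assuming each column starts where its (single-word) header starts.
sub template_from_header {
    my ($header) = @_;
    my @starts;
    push @starts, $-[0] while $header =~ /\S+/g;    # offset of each header word
    my @widths = map { $starts[$_ + 1] - $starts[$_] } 0 .. $#starts - 1;
    return join ' ', (map "A$_", @widths), 'A*';    # last column takes the rest
}

my $header   = sprintf '%-19s%-15s%-19s%s', qw(NAME1 NAME2 DEPTNAME POSITION);
my $template = template_from_header($header);

my $row = sprintf '%-19s%-15s%-19s%s',
    'JONH MILLER', 'ROBERT JIM', 'CS', 'ASST GENERAL MANAGER';
my ($name1, $name2, $dept, $position) = unpack $template, $row;
print "$name1 / $name2 / $position\n";
```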

Assuming the amount of space between the fields is not fixed, split the string on two or more whitespace characters, so that a name like JONH MILLER is not broken into two parts.

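#!/usr/bin/perl
use strict;
use warnings;

# Header and data row; the line break and the indentation also count as whitespace.
my $string = "NAME1              NAME2          DEPTNAME           POSITION
             JONH MILLER        ROBERT JIM     CS                 ASST GENERAL MANAGER ";
# Split on runs of two or more whitespace characters.
my @string_parts = split /\s\s+/, $string;
foreach my $test (@string_parts) {
    print "$test\n";
}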
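To get only NAME1, NAME2 and POSITION, the same split can be applied one line at a time and sliced. The following is a sketch under the assumption that fields are always separated by at least two spaces; name_and_position is a hypothetical helper. It trims the line first, because leading whitespace would otherwise produce an empty first field.

```perl
use strict;
use warnings;

# Hypothetical helper: split one line on runs of two or more whitespace
# characters and keep NAME1, NAME2 and POSITION (indexes 0, 1 and 3).
sub name_and_position {
    my ($line) = @_;
    $line =~ s/^\s+|\s+$//g;    # trim both ends first
    return (split /\s{2,}/, $line)[0, 1, 3];
}

my @lines = (
    "NAME1              NAME2          DEPTNAME           POSITION",
    "JONH MILLER        ROBERT JIM     CS                 ASST GENERAL MANAGER ",
);
print join(' ', name_and_position($_)), "\n" for @lines;
```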

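In the sample, a single space belongs to the data, but two or more contiguous spaces do not, so you can split on two or more spaces. The only thing I would add is List::MoreUtils::mesh, which pairs the column names with the values of each line (note that List::MoreUtils is a CPAN module, not part of core Perl):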

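use List::MoreUtils qw<mesh>;
# $file is assumed to be an already opened filehandle; its first line holds the column names.
my @names   = map { chomp; $_ } split /\s{2,}/, <$file>;
# One hash reference per remaining line; the leading + marks the braces as an anonymous hash.
my @records = map { chomp; +{ mesh( @names, @{[ split /\s{2,}/ ]} ) } } <$file>;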
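Because the question rules out CPAN modules, the same name-to-value pairing can be sketched with a plain hash slice from core Perl. The in-memory filehandle below only stands in for a real file and is an assumption for the sake of a self-contained example.

```perl
use strict;
use warnings;

# In-memory stand-in for the real file (an assumption for this example).
my $data = join '',
    sprintf("%-19s%-15s%-19s%s\n", qw(NAME1 NAME2 DEPTNAME POSITION)),
    sprintf("%-19s%-15s%-19s%s\n", 'JONH MILLER', 'ROBERT JIM', 'CS', 'ASST GENERAL MANAGER');
open my $file, '<', \$data or die "Cannot open in-memory file: $!";

# The first line holds the column names.
chomp(my $header = <$file>);
my @names = split /\s{2,}/, $header;

my @records;
while (my $line = <$file>) {
    chomp $line;
    my %record;
    @record{@names} = split /\s{2,}/, $line;    # hash slice pairs names with values
    push @records, \%record;
}
close $file;

print join(' ', @{ $records[0] }{qw(NAME1 NAME2 POSITION)}), "\n";
```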

Consider using autosplit in a Perl one-liner from your command line:

$ perl -F'/\s{2,}/' -lane 'print qq/@F[0,1,3]/' file

The one-liner will split on two or more consecutive spaces and print the first, second and fourth fields, corresponding to NAME1, NAME2 and POSITION fields.

Of course, this will break if you have only a single space separating NAME1 and NAME2 entries, but more information is needed about your file in order to ascertain what the best course of action might be.
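The failure case described above can be illustrated with a row whose first name nearly fills its column. The name used here is hypothetical, and the column widths 19/15/19 are assumed from the sample data; under those assumptions a fixed-width unpack still separates the fields, while the split merges the two names.

```perl
use strict;
use warnings;

# Hypothetical row: NAME1 is 18 characters wide, so only one space
# separates it from NAME2 in the 19-character column.
my $row = sprintf '%-19s%-15s%-19s%s',
    'MAXIMILIAN JOHNSON', 'ROBERT JIM', 'CS', 'ASST GENERAL MANAGER';

my @by_split  = split /\s{2,}/, $row;             # NAME1 and NAME2 merge into one field
my @by_unpack = unpack 'A19 A15 A19 A*', $row;    # columns stay separate

printf "split:  %d fields\n", scalar @by_split;
printf "unpack: %d fields\n", scalar @by_unpack;
```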

To split on runs of whitespace:

@string_parts = split /\s{2,}/, $string;

This splits $string into a list of substrings. The separator is the regex \s{2,}, which matches two or more consecutive whitespace characters (spaces, tabs and newlines). A single space therefore stays inside its field, as the question requires, so a name like JONH MILLER is kept in one piece.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow