Question

I have data like

"scott
E -45  COLLEGE LANE
BENGALI MARKET
xyz  -785698."
"Tomm
D.No: 4318/3,Ansari Road, Dariya Gunj,
xbc - 289235."

I wrote a Perl program to extract the names:

open(my$Fh, '<', 'printable address.txt') or die "!S";
open(my$F, '>', 'names.csv') or die "!S";
while (my@line =<$Fh> ) {
    for(my$i =0;$i<=13655;$i++){
        if ($line[$i]=~/^"/) {
        print $F $line[$i];
        }

    }
}

It works fine and extracts the names exactly. Now my aim is to extract the addresses, like this:

BENGALI MARKET
xyz  -785698."
D.No: 4318/3,Ansari Road, Dariya Gunj,
xbc - 289235."

into a CSV file. How can I do this? Please tell me.


Solution

There are a lot of flaws in your original program. I should address those before suggesting any enhancements:

  1. Always have use strict; and use warnings; at the top of every script.
  2. Your or die "!S" statements are broken. The error is actually held in $!, so that is the variable the message should interpolate. However, you can skip writing those checks by hand by just having use autodie;
  3. Give your filehandles more meaningful names. $Fh and $F say nothing about what those are for. At minimum label them as $infh and $outfh.
  4. The while (my @line = <$Fh>) { is flawed, as it can just be reduced to my @line = <$Fh>;. Because you're calling readline in list context, it slurps the entire file on the first iteration, and the loop exits on the next one. Instead, assign to a scalar so that you read one line at a time, and you don't even need the inner for loop.
  5. If you wanted to slurp your entire file into @line, your use of for(my$i =0;$i<=13655;$i++){ is also flawed. You should iterate to the last index of @line, which is $#line.
  6. if ($line[$i]=~/^"/) { is also flawed, as it leaves the quote character " at the beginning of the names you're trying to match. Instead, add a capture group to pull out just the name.
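
Point 5 can be sketched as follows, in case you do want to keep the slurping approach. This is only a minimal illustration: the quoted_lines helper and the sample data are hypothetical and stand in for the contents of the real file.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines that start with a double quote.
sub quoted_lines {
    my @line = @_;
    my @found;
    # $#line is the last valid index, so no hard-coded line count is needed.
    for my $i (0 .. $#line) {
        push @found, $line[$i] if $line[$i] =~ /^"/;
    }
    return @found;
}

# Sample lines standing in for the slurped file.
my @sample = ("\"scott\n", "E -45  COLLEGE LANE\n", "\"Tomm\n");
print quoted_lines(@sample);
```

A hard-coded limit such as 13655 either skips lines in a longer file or runs past the end of a shorter one, which use warnings would report as uninitialized values.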

With the suggested changes, the code reduces to:

use strict;
use warnings;
use autodie;

open my $infh, '<', 'printable address.txt';
open my $outfh, '>', 'names.csv';

while (my $line = <$infh>) {
    if ($line =~ /^"(.*)/) {
        print $outfh "$1\n";
    }
}

Now, if you also want to isolate the address, you can use a method similar to the one used for the name. I'm going to assume that you might want to build the whole address in a variable, so you can do something more complicated with it than writing each line straight to a file. However, mirroring the file setup for now:

use strict;
use warnings;
use autodie;

open my $infh, '<', 'printable address.txt';
open my $namefh, '>', 'names.csv';
open my $addressfh, '>', 'address.dat';

my $address = '';

while (my $line = <$infh>) {
    if ($line =~ /^"(.*)/) {
        print $namefh "$1\n";

    } elsif ($line =~ /(.*)"$/) {
        $address .= $1;
        print $addressfh "$address\n";
        $address = '';

    } else {
        $address .= $line;
    }
}
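
If you would rather keep the records in memory before deciding what to do with them, the same three-branch logic can fill a data structure instead of writing to files. The following is a sketch under assumptions: the read_records helper, the name and address hash keys, and the in-memory sample are all hypothetical and are not part of the original program.

```perl
use strict;
use warnings;

# Hypothetical helper: read records from an open filehandle and return
# a list of hash refs, each with a 'name' and an 'address' (array ref of lines).
sub read_records {
    my ($fh) = @_;
    my (@records, $name, @address);

    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^"(.*)/) {           # opening quote: name line
            $name    = $1;
            @address = ();
        } elsif ($line =~ /(.*)"$/) {      # closing quote: last address line
            push @address, $1;
            push @records, { name => $name, address => [@address] };
        } else {                           # anything else is an address line
            push @address, $line;
        }
    }
    return @records;
}

# Usage with an in-memory filehandle standing in for 'printable address.txt'.
my $sample = qq{"scott\nE -45  COLLEGE LANE\nBENGALI MARKET\nxyz  -785698."\n}
           . qq{"Tomm\nD.No: 4318/3,Ansari Road, Dariya Gunj,\nxbc - 289235."\n};
open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";

for my $rec (read_records($fh)) {
    print "$rec->{name}: ", join(' | ', @{ $rec->{address} }), "\n";
}
```

Keeping the address as a list of lines leaves the choice of separator to whatever consumes the records later.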

Ultimately, no matter what you want to use your data for, your best solution is probably to output it to a real CSV file using Text::CSV. That way it can be imported into a spreadsheet or some other system very easily, and you won't have to parse it again.

use strict;
use warnings;
use autodie;

use Text::CSV;

my $csv = Text::CSV->new ( { binary => 1, eol => "\n" } ) 
    or die "Cannot use CSV: ".Text::CSV->error_diag ();

open my $infh, '<', 'printable address.txt';
open my $outfh, '>', 'address.csv';

my @data;

while (my $line = <$infh>) {
    # Name Field
    if ($line =~ /^"(.*)/) {
        @data = ($1, '');

    # End of Address        
    } elsif ($line =~ /(.*)"$/) {
        $data[1] .= $1;
        $csv->print($outfh, \@data);

    # Address lines     
    } else {
        $data[1] .= $line;
    }
}
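
Note that the address field built above keeps its embedded line breaks, which is presumably why binary => 1 is set. If you would prefer one physical line per record, you could collapse the address before calling $csv->print. This is a minimal sketch, assuming a comma and a space are an acceptable separator; the flatten_address helper is hypothetical and not part of Text::CSV.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse a multi-line address into a single line.
sub flatten_address {
    my ($address) = @_;
    # Replace each line break (plus any comma or whitespace around it)
    # with a single ", " separator. \R needs Perl 5.10 or later.
    $address =~ s/,?\s*\R\s*/, /g;
    return $address;
}

print flatten_address("D.No: 4318/3,Ansari Road, Dariya Gunj,\nxbc - 289235."), "\n";
```

In the loop above, this would be applied to $data[1] just before the record is printed.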
Licensed under: CC-BY-SA with attribution