Question

I am reading a file using perl script. This file consists of strings with different characters and I am supposed to identify strings containing the character 'X'. I want to know how should I (1) print this string (containing 'X') and also (2) write this string to another file (3) count the number of 'X' characters in the whole file. The script below prints the whole file again. Any suggestions?

#!/use/bin/perl
use strict;
use warnings;

open (FILE, "/home/user/Desktop/infile.phy") || die "cant open file\n";
my @body = <FILE>;
close (FILE);
my $count= 0;
my $string = '';
foreach $_(@body){
    if ($_ =~ m/[X]/){
        print "$_";
        $count++;
        print $count;
    }
    else {
        print ;
    }
}
exit;
Was it helpful?

Solution

Since this is code review, let's go one by one:

#!/use/bin/perl

That shebang line is most likely a typo. It should probably be

#!/usr/bin/perl

or whatever which perl returns on your system.

use strict;
use warnings;

Good.

open (FILE, "/home/user/Desktop/infile.phy") || die "cant open file\n";

No need for package global filehandles when you can use lexical filehandles. The 3-argument form of open is preferable these days. Also, the error message should indicate the file which you could not open:

my $filename = '/home/user/Desktop/infile.phy';
open my $input, '<', $filename
    or die "Cannot open '$filename' for reading: $!";

my @body = <FILE>;

You are slurping the file into an array. That is completely unnecessary in this case.

my $count  = 0;
my $string = '';

Declare and initialize (if necessary) any variables in the smallest possible scope.

my $count;

The variable $string is not used anywhere else in your code.

foreach $_(@body){

This is silly. for uses $_ if no loop variable is specified. It is easier to keep things straight if you instead specify a lexical loop variable.

for my $line ( @body ) {

However, I do not think you should slurp the file.

        if ($_ =~ m/[X]/){

That results in a successful match if the line contains an X. So, it is equivalent to /X/. However, that will not tell you the word that contained the 'X'. For that, you need to decide what a word is and do your matching at the word level.

With all that in mind, consider the following script. I have made a simplifying assumption regarding what I consider to be a word. You should be able to build on this to satisfy all the requirements:

#!/usr/bin/perl

use strict;
use warnings;

my $filename = "$ENV{TEMP}/test.txt";
open my $input, '<', $filename
    or die "Cannot open '$filename' for reading: $!";

my $count;

while ( my $line = <$input> ) {
    my @words = grep { /X/ } split /\b/, $line;
    $count += @words;
    print join(', ', @words), "\n";
}

print "$count\n";

__END__

UPDATE: If you do not care about finding the words within each line that have one or more X characters, the while loop would be simplified:

while ( <$input> ) { 
    $count += (my @matches = /(X)/g );
    print if @matches;
}

by using $_. That, however, is probably inefficient (given that we are saving each matched X character). In this case, tr works best:

my ($count, $n);
$n = tr/X// and $count += $n and print while <$input>;

OTHER TIPS

You are printing $_ in both branches of your if-clause. Get rid of the else branch.

Assuming "string" in your question equals "line":

use strict;
use warnings;

@ARGV=qw(/home/user/Desktop/infile.phy);

my $count = 0;
open my $outfile, '>', 'outfile' or die $!;
while (<>) {
  my $cnt = tr/X/X/;
  if ($cnt) {
    print;
    print $outfile $_;
  }
  $count += $cnt;
}

close $outfile or die $!;

print $count;
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top