Question

Please help me improve the following code. I am not able to print the sequence on a single line. I would like the output printed on four lines, each giving the frequency of one of the four nucleotides. Thanks in advance.

#!/usr/bin/perl
use strict;
use warnings;
my $A;    
my $T;
my $G;
my $C;
my $fileIN;
my $fileOUT;

my $seq ;
open ($fileIN, "basecount.nfasta") or die "can't open file ";
open ($fileOUT, ">basecount.out") or die "can't open file ";

while (<$fileIN>)
{

             if ($_ =~/^>/)  #ignore header line
             {next;}

             else
                   {
                    $seq  = $_; #copy the all line with only nucleotide characters ATGC
                   }
            $seq  =~ s/\n//g; #create one single line containing all ATGC characters

             print "$seq\n"; # verify previous step

             my @dna = split ("",$seq); #create an array to include each nucleotide as array element

             foreach my $element (@dna)

            {
            if ($element =~/A/) # match nucleotide pattern and count
                            {
                             $A++;
                            }
             if ($element =~/T/)
                            {
                             $T++;
                            }
             if ($element =~/G/)
                            {
                             $G++;
                            }
             if ($element =~/C/)
                            {
                             $C++;
                            }

            }

            print $fileOUT "A=$A\n";
            print $fileOUT "T=$T\n";
            print $fileOUT "G=$G\n";
            print $fileOUT "C=$C\n";
}

close ($fileIN);
close ($fileOUT);

Solution

First, I would use some shortcuts, which make the code easier to read:

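use strict;
use warnings;
use feature 'say';

# Start the counters at 0 so that a base which never occurs prints as 0
# instead of triggering an "uninitialized value" warning.
my $A = 0;
my $T = 0;
my $G = 0;
my $C = 0;
my $fileIN;
my $fileOUT;

open $fileIN,  '<', 'basecount.nfasta' or die "can't open file basecount.nfasta for reading: $!";
open $fileOUT, '>', 'basecount.out'    or die "can't open file basecount.out for writing: $!";

while ( my $seq = <$fileIN> ) {

  next if $seq =~ /^>/;    # ignore header lines
  $seq =~ s/\n//g;         # strip the trailing newline
  say $seq;                # echo the sequence line to verify the previous step

  my @dna = split //, $seq;

  foreach my $element ( @dna ) {
    $A++ if $element =~ m/A/;
    $T++ if $element =~ m/T/;
    $G++ if $element =~ m/G/;
    $C++ if $element =~ m/C/;
  }
}

# Print the totals once, after the whole file has been read,
# so that the output file contains exactly four lines.
say $fileOUT "A=$A";
say $fileOUT "T=$T";
say $fileOUT "G=$G";
say $fileOUT "C=$C";

close $fileIN;
close $fileOUT;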

Using the three-argument form of open is also recommended, along with a die message that names the file and says what the script was trying to do with it.
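
To illustrate the point about three-argument open, here is a minimal sketch. The helper name count_lines is made up for this example, and the sketch warns and returns instead of dying so that it runs even when the file is missing. Including $! in the message shows the operating system's reason for the failure.

```perl
use strict;
use warnings;

# count_lines is a hypothetical helper, not part of the original script.
# It returns the number of lines in a file, or undef if the file cannot be opened.
sub count_lines {
    my ($path) = @_;
    my $fh;

    # Three-argument form: the mode ('<') is kept separate from the file name,
    # so a name that starts with '>' or '|' cannot change how the file is opened.
    unless ( open $fh, '<', $path ) {
        # $! holds the system error message, e.g. "No such file or directory".
        warn "can't open '$path' for reading: $!\n";
        return;
    }

    my $lines = 0;
    $lines++ while <$fh>;
    close $fh;
    return $lines;
}

my $n = count_lines('basecount.nfasta');
print "basecount.nfasta has $n lines\n" if defined $n;
```

Note that when open is written without parentheses, the low-precedence "or" is the operator to use before die; "||" would bind to the file name instead of to the result of open.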

EDIT: I used use feature 'say' here because all of your print statements end with a newline. say does exactly the same as print, except that it appends a newline at the end.
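
As an alternative to splitting each line into single characters, the counting could also be done with the tr/// operator, which returns the number of characters it matched. The following is only a sketch under some assumptions: count_bases is a made-up helper name, the sample data is invented, and an in-memory filehandle stands in for basecount.nfasta so the example is self-contained.

```perl
use strict;
use warnings;
use feature 'say';

# count_bases is a hypothetical helper: it reads FASTA-style text from a
# filehandle and returns a hash with the totals for A, T, G and C.
sub count_bases {
    my ($fh) = @_;
    my %count = ( A => 0, T => 0, G => 0, C => 0 );

    while ( my $line = <$fh> ) {
        next if $line =~ /^>/;    # skip header lines
        chomp $line;

        # With an empty replacement list, tr/// leaves the string unchanged
        # and returns how many characters matched.
        $count{A} += ( $line =~ tr/A// );
        $count{T} += ( $line =~ tr/T// );
        $count{G} += ( $line =~ tr/G// );
        $count{C} += ( $line =~ tr/C// );
    }
    return %count;
}

# Invented sample data, read through an in-memory filehandle.
my $fasta = ">seq1\nATGC\nAATT\n>seq2\nGGCA\n";
open my $in, '<', \$fasta or die "can't open in-memory file: $!";
my %count = count_bases($in);
close $in;

# The totals are printed once, after all input has been read:
# A=4
# T=3
# G=3
# C=2
say "$_=$count{$_}" for qw(A T G C);
```

Because say is called only after the loop has finished, the output is exactly four lines, one per nucleotide.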

License: CC-BY-SA with attribution
Not affiliated with StackOverflow