As threatened, the following is a cleaned up version of your script:
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
die "Usage: $0 Filename\n" if @ARGV != 1;
my $file = shift;
open my $infh, '<', $file;
open my $outfh, '>', 'AtPTCindex.txt';
while (my $line = <$infh>) {
chomp($line);
my $result = '';
for (my $i = 0; $i < (length($line) - 2); $i += 3) {
my $codon = substr($line, $i, 3);
if ($codon =~ m/TAG|TGA|TAA/) {
# m added before /TAG... above
my $stopseq = substr($line, $i, 9);
$result = index($line, $stopseq);
$result .= " ($i, $codon, $stopseq)";
last;
}
}
print "$result\n";
# print $outfh "$result\n";
# print $outfh "$result $.\n";
}
close $infh;
close $outfh;
For the 5 lines of data that you provided, the following is the output:
84 (84, TGA, TGATCATTC)
3 (3, TGA, TGATTGTTT)
3 (3, TAA, TAATTTTTG)
4 (27, TAG, TAGTAAACA)
123 (123, TAA, TAAGATTAA)
I believe your issue is with these lines:
my $stopseq = substr($line, $i, 9);
$result = index($line, $stopseq);
You're pulling a sequence from the $line
at position $i
, and then immediately doing an index
for it. In the case of 4 of 5 of those lines, it immediately finds the same value $i
. However, in the case of line 4, it finds a matching sequence earlier in the line.
If this isn't desired, you'll have to explain what your desired behavior actually is. Perhaps, you just want $i
? Or are you looking for a matching stop sequence any point AFTER $i
? You'll have to specify what your actual logic wants to be.