This alternative program may help you.
As far as I can tell, what you need is an output record labelled SPKR-INFO for each unique speaker, followed by a reformatted version of each original line labelled SPEAKER.
The input data you show doesn't seem to correspond with your required output, so my program below uses this input:
0.000 8.556 speech_L1
8.556 21.063 speech_L2
32.304 9.515 speech_L3
42.049 0.767 speech_L1
The biggest change is that I have abandoned the @rttm array, as on the face of it you can just print each line to the output file as you come to it.
I have also removed the awkward while loops that iterate over the array indices. Because the index is needed only to access the array element, it is simpler and clearer just to iterate over the array values directly.
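To illustrate the point about indices, here is a minimal sketch that builds the same list twice, once with an index-driven while loop and once by iterating over the values. The @labels data and the variable names are made up for the illustration and are not taken from your program.

```perl
use strict;
use warnings;

# Hypothetical sample data, for illustration only
my @labels = ('speech_L1', 'speech_L2', 'speech_L3');

# Index-based loop: $i is used only to fetch the element
my @by_index;
my $i = 0;
while ($i < @labels) {
    push @by_index, "SPEAKER $labels[$i]";
    $i++;
}

# Value-based loop: same result, no index bookkeeping
my @by_value;
for my $label (@labels) {
    push @by_value, "SPEAKER $label";
}

print "$_\n" for @by_value;
```

Both loops produce identical lists; the second simply has fewer places for an off-by-one mistake.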
Note also that, if you have autodie in place, there is no need to test the success of open calls with an or die... clause.
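As a small sketch of what autodie does, the snippet below tries to open a file that should not exist and catches the resulting exception. The file name is an assumption chosen for the demonstration, and the eval wrapper is there only so the script can report the failure instead of terminating.

```perl
use strict;
use warnings;
use autodie;

# Hypothetical file name, assumed not to exist in the current directory
my $missing = 'no_such_file_for_autodie_demo';

# No "or die" here: with autodie a failed open throws an exception by itself
my $error;
eval {
    open my $fh, '<', $missing;
    close $fh;
    1;
} or $error = $@;

print "open failed as expected: $error\n" if $error;
```

Without the eval, the program would stop at the failed open with a message naming the file and the reason, which is usually what you want in a short script.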
Since you have included the List::MoreUtils module, I have used its uniq function instead of coding the same thing by hand with a @uniq array.
use strict;
use warnings;
use autodie;

use List::MoreUtils qw(uniq);

my $nume = 'etichete';

# Read every input line and split it into [ start, duration, label ]
open my $fh, '<', $nume;
my @file;
while (<$fh>) {
    push @file, [ split ];
}
close $fh;

# Sorted list of the distinct speaker labels (third field)
my @unique_speakers = sort { $a cmp $b } uniq map { $_->[2] } @file;

open my $out, '>', "$nume.rttm";

# One SPKR-INFO record per unique speaker
for my $speaker (@unique_speakers) {
    print $out join(' ', 'SPKR-INFO', $nume, '1', '<NA>', '<NA>', '<NA>', 'unknown', $speaker, '<NA>'), "\n";
}

# One SPEAKER record per input line, in the original order
for my $line (@file) {
    print $out join(' ', 'SPEAKER', $nume, '1', $line->[0], $line->[1], '<NA>', '<NA>', $line->[2], '<NA>'), "\n";
}

close $out;
Output:
SPKR-INFO etichete 1 <NA> <NA> <NA> unknown speech_L1 <NA>
SPKR-INFO etichete 1 <NA> <NA> <NA> unknown speech_L2 <NA>
SPKR-INFO etichete 1 <NA> <NA> <NA> unknown speech_L3 <NA>
SPEAKER etichete 1 0.000 8.556 <NA> <NA> speech_L1 <NA>
SPEAKER etichete 1 8.556 21.063 <NA> <NA> speech_L2 <NA>
SPEAKER etichete 1 32.304 9.515 <NA> <NA> speech_L3 <NA>
SPEAKER etichete 1 42.049 0.767 <NA> <NA> speech_L1 <NA>
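
If List::MoreUtils is not available, the effect of uniq on the speaker column can be approximated with the common %seen hash idiom. This is only a sketch of the behaviour using the same four input records as above; it is not the module's actual implementation.

```perl
use strict;
use warnings;

# The same four input records as above, already split into fields
my @file = (
    [ '0.000',  '8.556',  'speech_L1' ],
    [ '8.556',  '21.063', 'speech_L2' ],
    [ '32.304', '9.515',  'speech_L3' ],
    [ '42.049', '0.767',  'speech_L1' ],
);

# Keep only the first occurrence of each label, in input order
my %seen;
my @unique = grep { !$seen{$_}++ } map { $_->[2] } @file;

# Sort the distinct labels as the main program does
my @unique_speakers = sort { $a cmp $b } @unique;

print "$_\n" for @unique_speakers;
```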