I have two files containing data like this:

FILE1 contains group numbers (first column), the group each one switches to (second column), and the frequency of that switch (third column):

FILE1:

1 2 0.6 
2 1 0.6
3 1 0.4
1 3 0.4
2 3 0.2

etc...

FILE2 contains group numbers (first column) and their frequency of occurrence (second column).

FILE2:

1 0.9
2 0.7
3 0.5

etc...

I want to make another file containing FILE2 with the values for each switch from FILE1 like this:

1 0.9 2 0.6 3 0.4 ...
2 0.7 1 0.6 3 0.2 ...

Basically, the first column should be the group number and the second its frequency of occurrence, followed by each group it switches to and the frequency of that switch, all on the same line. The next line does the same for group 2, and so on.

My plan is to read FILE1 into a hash of arrays keyed by group number, where each value is an array of subarrays, one per switch, holding the target group and the switch frequency. Then I want a second hash with the same keys, taken from the first column of FILE2, whose values come from the second column of FILE2. Finally I will print "hash2 key, hash2 value, the whole hash1 array for that key". This is my attempt in Perl:

#!/usr/bin/perl -W

$input1= $ARGV[0];
$input2 = $ARGV[1];
$output = $ARGV[2];

%switches=();

open (IN1, "$input1");
while (<IN1>) {
 @tmp = split (/\s+/, $_);
 chomp @tmp;
 $group = shift @tmp;
 $switches{$group} = [@tmp];

 push (@{$switches{$group}}, [@tmp]);

}

close IN1;

%groups=();

open (IN2, "$input2");
while (<IN2>) {
 chomp $_;
 ($group, $pop) = split (/\s+/, $_);
 $groups{$group} = $pop;
}
close IN2;

open (OUT, ">$output");

foreach $group (keys %groups) {
  print OUT "$group $pop @{$switches{$group}}\n"
}

close OUT;

The output I get contains something like:

1 0.1 2 0.1 ARRAY(0x100832330) 
2 0.3 5 0.2 ARRAY(0x1008325d0)

So basically:

"group" "the last frequency number read" "the last group that this group switches to" "the last switch frequency" "something like ARRAY(0x100832330)"

I assume I am doing something wrong when pushing the switches into the hash of arrays while reading FILE1, and also when dereferencing at the end to print.

Please help, Thanks!


Solution

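Your loop stores each line twice: the assignment $switches{$group} = [@tmp] overwrites whatever was already collected for that group, and the push then appends a nested copy of the same line (the ARRAY(0x...) in your output). Drop the assignment and keep only the push. You also need to dereference each inner array when printing, and to print $groups{$group} rather than $pop, which only holds the last value read. Here is your code with minimal changes: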

$input1= $ARGV[0];
$input2 = $ARGV[1];
$output = $ARGV[2];

%switches=();

open (IN1, "$input1");
while (<IN1>) {
 @tmp = split (/\s+/, $_);
 chomp @tmp;
 $group = shift @tmp;

 push (@{$switches{$group}}, [@tmp]);

}

close IN1;

%groups=();

open (IN2, "$input2");
while (<IN2>) {
 chomp $_;
 ($group, $pop) = split (/\s+/, $_);
 $groups{$group} = $pop;
}
close IN2;

open (OUT, ">$output");

foreach $group (sort {$a <=> $b} keys %groups) {
    print OUT "$group $groups{$group}";
    for my $aref (@{$switches{$group}}) {
        # leading space keeps consecutive pairs from running together
        print OUT " @{$aref}";
    }
    print OUT "\n";
}

close OUT;


__END__


1 0.9 2 0.6 3 0.4
2 0.7 1 0.6 3 0.2
3 0.5 1 0.4

See also perldoc perldsc and perldoc Data::Dumper
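
As a rough illustration of why ARRAY(0x...) shows up, the sketch below builds the same kind of structure from a few inlined FILE1 rows and dumps it with Data::Dumper. The helper name flatten_switches is made up for this example and is not part of the code above.

```perl
use strict;
use warnings;
use Data::Dumper;

# Sample rows from FILE1, inlined so the sketch runs on its own.
my @rows = ('1 2 0.6', '2 1 0.6', '3 1 0.4', '1 3 0.4', '2 3 0.2');

my %switches;
for my $row (@rows) {
    my ($group, @pair) = split ' ', $row;
    push @{ $switches{$group} }, [@pair];   # one [target, freq] pair per switch
}

# Hypothetical helper: interpolating only the outer array would print
# references such as ARRAY(0x...); dereferencing each inner array as well
# yields the actual numbers.
sub flatten_switches {
    my ($pairs) = @_;
    return join ' ', map { "@$_" } @{ $pairs || [] };
}

$Data::Dumper::Sortkeys = 1;   # stable key order in the dump
print Dumper(\%switches);
print flatten_switches($switches{1}), "\n";   # 2 0.6 3 0.4
```

The dump makes the two levels visible: each hash value is an array, and each element of that array is itself a two-element array.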

Other tips

Since each column represents something of value, instead of an array, you should store your data in a more detailed structure. You can do this via references in Perl.

A reference is a pointer to another data structure. For example, you could store your groups in a hash. However, instead of each hash value containing a bunch of numbers separated by spaces, each hash value points to an array that contains the data points for that group. Each data point in that array in turn points to a hash whose keys are SWITCH, for the group it switches to, and FREQ, for the frequency.

You could refer to the frequency of the first data point of Group 1 as:

$data{1}->[0]->{FREQ};

This way, you can more easily manipulate your data -- even if you're simply rewriting it into another flat file. You can also use the Storable module to write your data in a way which saves its structure.
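
A minimal sketch of the Storable round trip, assuming a small hand-written %data and a temporary file; the file name switches.storable is only illustrative.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

# Hypothetical sample in the hash-of-arrays-of-hashes shape described above.
my %data = (
    1 => [
        { SWITCH => 2, FREQ => 0.6 },
        { SWITCH => 3, FREQ => 0.4 },
    ],
);

# Write into a temporary directory that is removed when the program exits.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/switches.storable";

store(\%data, $path) or die "could not store data to $path";

# retrieve returns a reference to a fresh copy with the same nesting.
my $copy = retrieve($path);
print $copy->{1}[1]{FREQ}, "\n";   # 0.4
```

The stored file is a binary format meant to be read back by Storable, not a human-readable flat file.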

#! /usr/bin/env perl
#
use strict;
use feature qw(say);
use autodie;
use warnings;
use Data::Dumper;

use constant {
    FILE1       => "file1.txt",
    FILE2       => "file2.txt",
};

my %data;  # A hash of an array of hashes (superfun!)

open my $fh1, "<", FILE1;

while ( my $line = <$fh1> ) {
    chomp $line;
    my ( $group, $switch, $frequency ) = split /\s+/, $line;
    if ( not exists $data{$group} ) {
        $data{$group} = [];
    }
    push @{ $data{$group} }, { SWITCH => $switch, FREQ => $frequency };
}
close $fh1;

open my $fh2, "<", FILE2;
while ( my $line = <$fh2> ) {
    chomp $line;
    my ( $group, $frequency ) = split /\s+/, $line;
    if ( not exists $data{$group} ) {
        $data{$group} = [];
    }
    push @{ $data{$group} }, { SWITCH => undef, FREQ => $frequency };
}
close $fh2;
say Dumper \%data;

This will give you:

$VAR1 = {
        '1' => [
                {
                    'SWITCH' => '2',
                    'FREQ' => '0.6'
                },
                {
                    'SWITCH' => '3',
                    'FREQ' => '0.4'
                },
                {
                    'SWITCH' => undef,
                    'FREQ' => '0.9'
                }
                ],
        '3' => [
                {
                    'SWITCH' => '1',
                    'FREQ' => '0.4'
                },
                {
                    'SWITCH' => undef,
                    'FREQ' => '0.5'
                }
                ],
        '2' => [
                {
                    'SWITCH' => '1',
                    'FREQ' => '0.6'
                },
                {
                    'SWITCH' => '3',
                    'FREQ' => '0.2'
                },
                {
                    'SWITCH' => undef,
                    'FREQ' => '0.7'
                }
                ]
        };

With the data in this structure, you can print it in whatever layout you need.
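
One possible way to get the one-line-per-group layout from this structure is sketched below. It assumes, as in the dump above, that the entry whose SWITCH is undef carries the group's own frequency from FILE2; the function name format_group is invented for this example.

```perl
use strict;
use warnings;

# Sample data in the same shape as the dump above.
my %data = (
    1 => [ { SWITCH => 2, FREQ => 0.6 }, { SWITCH => 3, FREQ => 0.4 },
           { SWITCH => undef, FREQ => 0.9 } ],
    2 => [ { SWITCH => 1, FREQ => 0.6 }, { SWITCH => 3, FREQ => 0.2 },
           { SWITCH => undef, FREQ => 0.7 } ],
    3 => [ { SWITCH => 1, FREQ => 0.4 }, { SWITCH => undef, FREQ => 0.5 } ],
);

# Hypothetical helper: build "group own_freq target freq target freq ...".
sub format_group {
    my ($group, $entries) = @_;
    my ($own)    = grep { !defined $_->{SWITCH} } @$entries;   # FILE2 entry
    my @switches = grep {  defined $_->{SWITCH} } @$entries;   # FILE1 entries

    my @fields = ($group);
    push @fields, $own->{FREQ} if $own;
    for my $entry (@switches) {
        push @fields, $entry->{SWITCH}, $entry->{FREQ};
    }
    return join ' ', @fields;
}

# Sort numerically so the groups come out in a predictable order.
for my $group (sort { $a <=> $b } keys %data) {
    print format_group($group, $data{$group}), "\n";
}
```

For the sample data this prints the three lines "1 0.9 2 0.6 3 0.4", "2 0.7 1 0.6 3 0.2" and "3 0.5 1 0.4".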

I apologize for the lack of analysis, but it is late and I should be in bed.

I hope this helps.

use strict;
use warnings;

my $fh;
my %switches;

open $fh, '<', 'file1.txt' or die $!;
while (<$fh>) {
  my ($origin, @switch) = split;
  push @{ $switches{$origin} }, \@switch;
}

open $fh, '<', 'file2.txt' or die $!;
while (<$fh>) {
  my ($origin, $freq) = split;
  my $switches = join ' ', map join(' ', @$_), @{ $switches{$origin} };
  print join(' ', $origin, $freq, $switches), "\n";
}

output

1 0.9 2 0.6 3 0.4
2 0.7 1 0.6 3 0.2
3 0.5 1 0.4

Update

Here is a fixed version of your own code that produces similar results. The main problem is that the values in your %switches hash are arrays of arrays, so you have to dereference twice. I've handled that by adding @switches, which holds the same contents as the current %switches value but with strings in place of the two-element arrays.

I've also added use strict and use warnings, and declared all your variables properly. The open calls have been changed to the three-argument open with lexical file handles as they should be, and they are now being checked for success. I've changed your split calls, as a simple bare split with no parameters is all you need. And I've removed your @tmp and used proper list assignments instead. Oh, and I've changed the wasteful [@array] to a simple \@array (which wouldn't have worked without declaring variables using my).

I still think my version is better, if only because it's much shorter, and yours prints the groups in random order.

#!/usr/bin/perl

use strict;
use warnings;

my ($input1, $input2, $output) = @ARGV;

my %switches;

open my $in1, '<', $input1 or die $!;
while (<$in1>) {
  my ($group, @switches) = split;
  push @{ $switches{$group} }, \@switches;
}

close $in1;

my %groups;

open my $in2, '<', $input2 or die $!;
while (<$in2>) {
 my ($group, $pop) = split;
 $groups{$group} = $pop;
}
close $in2;

open my $out, '>', $output or die $!;
for my $group (keys %groups) {
  my $pop = $groups{$group};
  my @switches = map "@$_", @{ $switches{$group} };
  print $out "$group $pop @switches\n"
}
close $out or die $!;
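
If a predictable order is wanted in the loop above, the keys can be sorted before iterating. The sketch below contrasts numeric and default string sorting; group 10 is a made-up value added only to make the difference visible.

```perl
use strict;
use warnings;

# Groups 1 to 3 come from FILE2; group 10 is hypothetical.
my %groups = (1 => 0.9, 2 => 0.7, 3 => 0.5, 10 => 0.1);

my @numeric = sort { $a <=> $b } keys %groups;   # compares keys as numbers
my @string  = sort keys %groups;                 # compares keys as strings

print "@numeric\n";   # 1 2 3 10
print "@string\n";    # 1 10 2 3
```

Replacing keys %groups with sort { $a <=> $b } keys %groups in the for loop would give the numeric order.
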
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow