Question

I have a Perl script which nests foreach loops as seen below. It takes a long time:

#! /usr/bin/perl

use strict;
use warnings;

my @sites = ('a', 'b', 'c');
my @servers = ('A', 'B');
my @data_type = ("X", "Y", "Z");

foreach my $site (@sites) {
    foreach my $server_type (@servers) {
        foreach my $data (@data_type) {
            #statements
        }
    }
}

Nesting foreach statements like this takes a long time and it's hard to read and not very pretty. Can anyone suggest a better way to code this structure using hashes, or some other clever structure?

Solution

I don't see what your problem is, but you could use a generic Cartesian product if you are used to SQL or something:

sub cartesian {
    my @C = map { [ $_ ] } @{ shift @_ };
    foreach (@_) {
        my @A = @$_;
        @C = map { my $n = $_; map { [ $n, @$_ ] } @C } @A;
    }
    return @C;
}

my @sites = ('a', 'b', 'c');
my @servers = ('A', 'B');
my @data_type = ("X", "Y", "Z");

foreach (cartesian(\@sites, \@servers, \@data_type)) {
    # cartesian() prepends each new element, so tuples arrive in reverse order
    my ($data, $server_type, $site) = @$_;
    print "$site $server_type $data\n";
}
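
If the reversed tuple order is inconvenient, a variant that appends instead of prepends keeps each tuple in the same order as the arguments and visits combinations in the same sequence as the nested loops. This is only a sketch: cartesian_ordered is a hypothetical name, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical variant of cartesian(): appends each new element, so every
# tuple keeps the same order as the array references passed in.
sub cartesian_ordered {
    my @tuples = ( [] );    # start with a single empty tuple
    for my $list (@_) {
        # extend every existing tuple with every element of the current list
        @tuples = map { my $t = $_; map { [ @$t, $_ ] } @$list } @tuples;
    }
    return @tuples;
}

my @sites     = ('a', 'b', 'c');
my @servers   = ('A', 'B');
my @data_type = ('X', 'Y', 'Z');

for my $tuple ( cartesian_ordered( \@sites, \@servers, \@data_type ) ) {
    my ( $site, $server_type, $data ) = @$tuple;
    print "$site $server_type $data\n";
}
```

Like the original, this builds the whole list of tuples in memory before the loop starts, which is fine for small inputs.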

OTHER TIPS

Use my Set::CrossProduct module, or use Algorithm::Loops. You shouldn't have to create hard-coded, nested structures to deal with these issues. Both of those modules can do it for you for an arbitrary number of arrays.

use Set::CrossProduct;

my @sites = ('a', 'b', 'c');
my @servers = ('A', 'B');
my @data_type = ("X", "Y", "Z");

my $cross = Set::CrossProduct->new( 
    [ \@sites, \@servers, \@data_type ]
    );

while( my $tuple = $cross->get ) {
    print "@$tuple\n";
    }

Not only that, but the cursor gives you ways to move around in the iterator so you don't have to limit yourself to the current combination. You can inspect the previous and next combinations, which might be useful for boundaries (like where the next tuple is a different server).

Watch out for people who want to create all of the combinations in memory. There's no need to do that either.
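
To illustrate the point about memory without any module, here is a minimal sketch of a lazy generator in plain Perl. It keeps one index per list and advances them like an odometer, so only the current tuple exists at any time. The name cross_iterator is hypothetical, and this is not how Set::CrossProduct is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that yields one tuple per call
# and an empty list once every combination has been produced.
sub cross_iterator {
    my @lists = @_;
    my @index = (0) x @lists;    # current position in each list
    # nothing to yield if there are no lists or any list is empty
    my $done = !@lists || grep { !@$_ } @lists;

    return sub {
        return if $done;
        my @tuple = map { $lists[$_][ $index[$_] ] } 0 .. $#lists;

        # advance like an odometer: the rightmost position moves fastest
        my $i = $#lists;
        while ( $i >= 0 ) {
            last if ++$index[$i] < @{ $lists[$i] };
            $index[ $i-- ] = 0;
        }
        $done = 1 if $i < 0;    # every position wrapped around: finished
        return @tuple;
    };
}

my @sites     = ('a', 'b', 'c');
my @servers   = ('A', 'B');
my @data_type = ('X', 'Y', 'Z');

my $next = cross_iterator( \@sites, \@servers, \@data_type );
while ( my ( $site, $server_type, $data ) = $next->() ) {
    print "$site $server_type $data\n";
}
```

The loop ends because a list assignment in scalar context evaluates to the number of values returned, which is zero once the iterator is exhausted.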

You could simply use for.

(sorry, couldn't resist)

foreach is preferable because it's readable. What exactly do you mean by "each array can cause problems" (what problems?) and "values can mismatch" (what values?)?

If I understand your question correctly, you are asking how to use hashes with foreach to avoid the mismatches that you would have in your array example.

If so then here is one example:

use strict;
use warnings;
use feature 'say';    # say is not enabled by default

my %sites = (

    a => { 
        A => {
            data_type => [ 'X', 'Y' ],
        }
    },

    b => {
        B => {
            data_type => [ 'Y', 'Z' ],
        }
    },

    c => {

    },
);

for my $site ( keys %sites ) {
    for my $server ( keys %{ $sites{ $site } } ) {
        for my $data ( keys %{ $sites{ $site }{ $server } } ) {
            my @data_types = @{ $sites{ $site }{ $server }{ data_type } };
            say "On site $site is server $server with $data @data_types";
        }
    }
}


You can also use while and each, which produces code that is easier on the eye:

while ( my ( $site, $site_info ) = each %sites ) {
    while ( my ( $server, $server_info ) = each %{ $site_info } ) {
        my @data_types = @{ $server_info->{data_type} };
        say "On site $site we have server $server with data types @data_types"
            if @data_types;
    }
}

Also note that I removed the last loop in the above example because it's currently superfluous with my example hash data.

NB. If you plan to add or delete keys while iterating, or to break out of the loop early, then please read up on each and how that affects the iteration.
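
A small sketch of the caveat: each hash has a single internal iterator, so leaving a while/each loop early with last leaves that iterator partway through, and the next each call on the same hash resumes from there. Calling keys on the hash resets it. The sub name and the sample data here are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical demo: count how many pairs a second loop sees after the
# first loop over the same hash was left early.
sub count_after_early_exit {
    my ($reset) = @_;
    my %servers = ( A => 'X', B => 'Y', C => 'Z' );

    while ( my ( $server, $data ) = each %servers ) {
        last;    # leave after the first pair; the iterator stays mid-way
    }

    keys %servers if $reset;    # calling keys resets the hash iterator

    my $count = 0;
    while ( my ( $server, $data ) = each %servers ) {
        $count++;
    }
    return $count;
}

print "without reset: ", count_after_early_exit(0), "\n";    # prints 2
print "with reset: ",    count_after_early_exit(1), "\n";    # prints 3
```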

PS. This example is not about the loop but about the data being better represented as a hash than as an array (though it's not 100% clear from the question that this is the case).

The only concern I might have when using nested loops is some ambiguity in what $_ is. Considering that you're not even using it, I don't think there's a better way to do what you want.

As a sidenote, I'd like to add that $_ is well defined in this case, but as a programmer I may not want to deal with the overhead of remembering what it refers to at each step.

Do you have any specific concerns with the code?

You can use a classic for loop instead.

for (my $i = 0; $i <= $#sites; $i++) {
    for (my $j = 0; $j <= $#servers; $j++) {
        for (my $k = 0; $k <= $#data_type; $k++) {
            # do_functions ...
        }
    }
}

But that still leaves the problems and mismatches you were referring to. I suggest you handle these issues in the do_functions part.

Licensed under: CC-BY-SA with attribution