Question

If I have a table in a text file such as:

  • A B 1
  • A C 2
  • A D 1
  • B A 3
  • C D 2
  • A E 1
  • E D 2
  • C B 2
  • . . .
  • . . .
  • . . .

I also have a list of symbols in another text file. I want to transform this table into a Perl data structure like:

  • _ A D E . . .
  • A 0 1 1 . . .
  • D 1 0 2 . . .
  • E 1 2 0 . . .
  • . . . . . . .

But I only need the selected symbols; for example, A, D and E are listed in the symbol file, but B and C are not.


Solution

Use an array for the input table and a two-dimensional hash (a hash of hashes) for the matrix. The array should look roughly like:

$list[0] # row 1 - the value is "A B 1"

And the hash like:

$hash{A}{A} # the intersection of A and A - the value is 0

Figuring out how to implement a problem is about 75% of the mental battle for me. I'm not going to go into specifics about how to print the hash or the array, because that's easy and I'm also not entirely clear on how you want it printed or how much you want printed. But converting the array to the hash should look a bit like this:

foreach (@list) {
  my ($letter1, $letter2, $value) = split ' ';  # ' ' splits on any run of whitespace
  $hash{$letter1}{$letter2} = $value;
}

At least, I think that's what you're looking for. If you really want you could use a regular expression, but that's probably overkill for just extracting 3 values out of a string.
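The loop above can be extended to cover the two requirements from the question that it leaves out: keeping only the selected symbols and mirroring each value. The following is a minimal sketch, not a definitive implementation. The helper names build_matrix and format_matrix are hypothetical, and it assumes that the table is symmetric, that a symbol paired with itself is 0, and that a later row overwrites an earlier one for the same pair. It needs Perl 5.10 or later for the // operator.

```perl
use strict;
use warnings;

# Hypothetical helper: build a symmetric hash of hashes from "X Y value" rows,
# keeping only rows whose two symbols are both in the selected list.
sub build_matrix {
    my ($rows, $selected) = @_;
    my %keep = map { $_ => 1 } @$selected;
    my %hash;

    # Assumption: a symbol paired with itself is 0, as in the question.
    $hash{$_}{$_} = 0 for @$selected;

    for my $row (@$rows) {
        my ($x, $y, $value) = split ' ', $row;
        next unless defined $value && $keep{$x} && $keep{$y};
        # Assumption: the table is mirrored; a later row overwrites an earlier one.
        $hash{$x}{$y} = $hash{$y}{$x} = $value;
    }
    return \%hash;
}

# Render the matrix as lines of text, with '_' in the corner and for missing pairs.
sub format_matrix {
    my ($hash, $selected) = @_;
    my @lines = (join ' ', '_', @$selected);
    for my $x (@$selected) {
        my @cells = map { $hash->{$x}{$_} // '_' } @$selected;
        push @lines, join ' ', $x, @cells;
    }
    return @lines;
}

my @list     = ('A B 1', 'A C 2', 'A D 1', 'B A 3', 'C D 2', 'A E 1', 'E D 2', 'C B 2');
my @selected = qw(A D E);
my $matrix   = build_matrix(\@list, \@selected);
print "$_\n" for format_matrix($matrix, \@selected);
```

With the sample rows from the question and A, D and E selected, this prints the 3x3 grid shown in the question (header row "_ A D E", then rows A, D and E).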

EDIT: Of course, you could forgo the @list and just assemble the hash straight from the file. But that's your job to figure out, not mine.
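For anyone who wants to see what skipping @list might look like, here is a rough sketch that fills the hash while reading. The helper names read_symbols and read_table are hypothetical, the symbol file is assumed to hold whitespace-separated symbols, and in-memory filehandles stand in for the two text files so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: read whitespace-separated symbols from a filehandle
# into a lookup hash.
sub read_symbols {
    my ($fh) = @_;
    my %keep;
    while (my $line = <$fh>) {
        $keep{$_} = 1 for split ' ', $line;
    }
    return \%keep;
}

# Hypothetical helper: fill the nested hash row by row as the table is read,
# without storing the lines in an array first.
sub read_table {
    my ($fh, $keep) = @_;
    my %hash;
    while (my $line = <$fh>) {
        my ($x, $y, $value) = split ' ', $line;
        next unless defined $value;               # skip blank or short lines
        next unless $keep->{$x} && $keep->{$y};   # skip unselected symbols
        $hash{$x}{$y} = $value;
    }
    return \%hash;
}

# Assumption: these strings stand in for the contents of the two text files.
my $symbol_text = "A D E\n";
my $table_text  = "A B 1\nA C 2\nA D 1\nB A 3\nC D 2\nA E 1\nE D 2\nC B 2\n";
open my $symbol_fh, '<', \$symbol_text or die "Cannot open symbol data: $!";
open my $table_fh,  '<', \$table_text  or die "Cannot open table data: $!";

my $keep = read_symbols($symbol_fh);
my $hash = read_table($table_fh, $keep);
print "A-D: $hash->{A}{D}\n";
```

To use real files, the two open calls would take file names instead of references to strings.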

OTHER TIPS

You can try this with awk:

awk -f matrix.awk yourfile.txt > newfile.matrix.txt

where matrix.awk is:

BEGIN {
   OFS="\t"
}
{
  row[$1,$2]=$3
  if (!($2 in f2)) { header=(header)?header OFS $2:$2;f2[$2]}
  # record each row label once, even when its lines are not adjacent
  if (!($1 in f1)) { col1[++c]=$1; f1[$1] }
}
END {
  printf("%*s%s\n", length(col1[1])+2, " ",header)
  ncol=split(header,colA,OFS)
  for(i=1;i<=c;i++) {
    printf("%s", col1[i])
    for(j=1;j<=ncol;j++)
      printf("%s%s%c", OFS, row[col1[i],colA[j]], (j==ncol)?ORS:"")
  }
}

Another way to do this would be to make a two-dimensional array:

my @fArray = ();
## Set the 0,0th element to "_"
push @{$fArray[0]}, '_';

## Assuming that the first line is the range of characters to skip, e.g. BC
chomp(my $skipExpr = <>);

while(<>) {
    my ($xVar, $yVar, $val) = split;

    ## Skip this line if it contains any of the characters to skip
    next if (/[$skipExpr]/);

    ## Check if these elements have already been added in your array
    checkExists($xVar);
    checkExists($yVar);

    ## Find their position
    my ($xPos, $yPos);
    for my $i (1..$#fArray) {
        $xPos = $i if ($fArray[0][$i] eq $xVar);
        $yPos = $i if ($fArray[0][$i] eq $yVar);
    }

    ## Set the value 
    $fArray[$xPos][$yPos] = $fArray[$yPos][$xPos] = $val;
}

## Print array
for my $i (0..$#fArray) {
    for my $j (0..$#{$fArray[$i]}) {
        print "$fArray[$i][$j]", " ";
    }
    print "\n";
}

sub checkExists {
    ## Checks if the corresponding array element exists,
    ## else creates and initialises it.
    my $nElem = shift;
    ## grep in scalar context counts the header entries equal to $nElem
    my $found = grep { $_ eq $nElem } @{$fArray[0]};

    if( $found == 0 ) {
        ## Create its corresponding column
        push @{$fArray[0]}, $nElem;

        ## and row entry.
        push @fArray, [$nElem];

        ## Get its array index
        my $newIndex = $#fArray;

        ## Initialise its corresponding column and rows with '_'
        ## this is done to enable easy output when printing the array
        for my $i (1..$#fArray) {
            $fArray[$newIndex][$i] = $fArray[$i][$newIndex] = '_';
        }

        ## Set the intersection cell value to 0
        $fArray[$newIndex][$newIndex] = 0;
    }
}

I am not too proud of the way I have handled references, but bear with a beginner here (please leave your suggestions/changes in the comments). The hash method mentioned above by Chris sounds a lot easier (not to mention a lot less typing).
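One possible simplification of the array approach, offered as a sketch rather than a drop-in replacement: keep a hash that maps each symbol to its position, so there is no need to scan the header row on every line. The names position_of, @labels, %index and @grid are hypothetical, the three input rows are hard-coded for illustration, and Perl 5.10 or later is assumed for the // operator.

```perl
use strict;
use warnings;

my @labels;   # position -> symbol
my %index;    # symbol   -> position
my @grid;     # $grid[$i][$j] holds the value for the pair ($labels[$i], $labels[$j])

# Hypothetical helper: return the position of a symbol, registering it on first use.
sub position_of {
    my ($symbol) = @_;
    unless (exists $index{$symbol}) {
        push @labels, $symbol;
        $index{$symbol} = $#labels;
    }
    return $index{$symbol};
}

# Assumption: rows for unselected symbols have already been filtered out.
for my $row ('A D 1', 'A E 1', 'E D 2') {
    my ($x, $y, $value) = split ' ', $row;
    my ($i, $j) = (position_of($x), position_of($y));
    $grid[$i][$j] = $grid[$j][$i] = $value;
}

# Print the grid: 0 on the diagonal, '_' for pairs that never appeared.
print join(' ', '_', @labels), "\n";
for my $i (0 .. $#labels) {
    my @cells = map { $i == $_ ? 0 : $grid[$i][$_] // '_' } 0 .. $#labels;
    print join(' ', $labels[$i], @cells), "\n";
}
```

The hash lookup replaces both the existence check and the position search, which is where most of the typing in the array version goes.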

CPAN has many potentially useful modules. I use Data::Table for many purposes. Data::Pivot also looks promising, but I have never used it.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow