Question

I need a subroutine that, given a set of characters, will generate all possible combinations of those characters of length k. Order matters and reuse is allowed, so if k = 2 then AB != BA and AA is an option. I found some working examples on PerlMonks, but unfortunately they are code golf and not easy for me to wrap my mind around. Can someone please do one or more of the following?

  1. Give a breakdown and explanation of how the first algorithm works.
  2. De-obfuscate the code so that the meaning is clearer.
  3. Point me toward another example that is clearer.

Thanks!

Was it helpful?

Solution

You can use variations_with_repetition from Algorithm::Combinatorics (which also provides an iterator-based interface), but if you just need a list, this is a fairly simple recursive algorithm:

sub ordered_combinations
{
  my ($data, $k) = @_;

  return @$data if $k == 1;

  my @previous = ordered_combinations($data, $k-1);

  my @results;
  for my $letter (@$data) {
    push @results, map { $letter . $_ } @previous;
  }

  return @results;
} # end ordered_combinations

print "$_\n" for ordered_combinations([qw(a b c)], 3);

This is basically the same algorithm the code golfers are using, but I'm using a for loop instead of nesting map. Also, I recurse only once per level (code golf is about minimizing source code, not runtime).

Any recursive function can be converted to an iterative one, which usually reduces its overhead. This one is fairly simple:

sub ordered_combinations
{
  my ($data, $k) = @_;

  return if $k < 1;

  my $results = $data;

  while (--$k) {
    my @new;
    for my $letter (@$data) {
      push @new, map { $letter . $_ } @$results;
    } # end for $letter in @$data

    $results = \@new;
  } # end while --$k is not 0

  return @$results;
} # end ordered_combinations

This version handles the $k == 0 case, which the original did not.

OTHER TIPS

I had a look at the very first piece of code on the page you referred to:

sub c{my$n=-1+shift;$n?map{my$c=$_;map$c.$_,c($n,@_)}@_:@_}

I have spread it out a bit to make it more readable; also I have made some changes to it to make it clearer (see combinations):

#!/usr/bin/perl

use strict;
use warnings;

sub c {
   my $n=-1+shift;
   $n ? map{
             my $c = $_;
             map $c . $_ , c($n ,@_)
           } @_
   : @_;
}

sub combinations {
   my $number = shift; # remove the first item from @_
   my @chars  = @_;    # the remainder of @_

   $number --; # decrement $number, so that you will eventually exit
               # from this recursive subroutine (once $number == 0)

   if ($number) { # true as long as $number != 0 and $number not undef

      my @result;

      foreach my $char (@chars) {
         my @intermediate_list = map { $char . $_ } combinations($number, @chars);
         push @result, @intermediate_list;
      }

      return @result; # the current concatenation result will be used for creation of
                      # @intermediate_list in the 'subroutine instance' that called 'combinations'
   }
   else {
      return @chars;
   }
}

print join " ", combinations(2, "A", "B");
print "\n";
print join " ", c(2, "A", "B");
print "\n\n";
print join " ", combinations(3, "A", "B");
print "\n";
print join " ", c(3, "A", "B");
print "\n";

Both versions work in the same way, and they produce exactly the same output:

AA AB BA BB
AA AB BA BB

AAA AAB ABA ABB BAA BAB BBA BBB
AAA AAB ABA ABB BAA BAB BBA BBB

I included some comments in the code, but perhaps a lengthier explanation is in order!? Well, here's an example to illustrate how things work: let's say we've got two items, "A" and "B", and we want to get all possible combinations of 2 of these items. In that case, $number will initially be equal to 2 (as we want to get pairs), and @chars will be equal to ('A', 'B').

The first time combinations is called, $number is decremented to 1, thus the if condition is met, and we enter the foreach loop. This first sets $char to 'A'. It then calls combinations(1, ('A', 'B')). As $number always gets decremented when the subroutine is called, $number is 0 in this 'child subroutine', consequently the child simply returns ('A', 'B'). Thus:

@intermediate_list = map { $char . $_ } ('A', 'B'); # $char eq 'A'

map then takes both 'A' and 'B' and concatenates each with 'A' ($char), thus @intermediate_list is ('AA', 'AB'). In the next round of the foreach loop, the same is done with $char = B, which sets @intermediate_list to ('BA', 'BB').

In each round the contents of @intermediate_list are pushed into the result list, hence @result eventually contains all possible combinations.

If you want to get triplets instead of pairs, you will obviously start with $number = 3, and combinations will be called three times. The second time it's called it will return @result, i.e. a pair-containing list. Each item from that list will be concatenated with each character of the initial character set.

Okay, I hope this makes sense. Please ask in case something has not become clear.

EDIT: Please see ysth's comment below.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top