Question

The problem is that I have n command-line arguments. There will always be at least 2, but the maximum number is unbounded. The first argument specifies a mode of operation and the second is a file to process. The 3rd through nth are the things to do to the file (there might be none, since the user might just want to clean the file, which is what happens when only 2 arguments are passed).

I'm looking at the methods available to me in Perl for working with arrays, but I'm not sure what the "Perlish" way of iterating from item 3 to the end of my array is.

Some options that I've seen:

  • Pop from the end of the array until I find an element that does not begin with "-" (since the file path does not begin with a "-", although I suppose it could, which might cause problems).
  • Shift the array twice to remove the first two elements. Whatever I'm left with I can just iterate over, if its size is at least 1.

I like the second option, but I don't know if it's Perlish. And since I'm trying to learn Perl, I might as well learn the right way to do things in Perl.


Solution

Aside from using a Getopt module, as Sinan wrote, I would probably go with:

my ( $operation, $file, @things ) = @ARGV;

And then you can:

for my $thing_to_do ( @things ) {
    ...
}
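A minimal sketch of how that list assignment might sit in a complete script. The helper name split_args, the usage message, and the sample arguments are assumptions made for illustration; they are not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: splits an argument list into the mode, the file,
# and whatever is left over. It works on a copy, so @ARGV is not changed.
sub split_args {
    my @args = @_;
    die "Usage: $0 MODE FILE [THING ...]\n" if @args < 2;
    my ( $operation, $file, @things ) = @args;
    return ( $operation, $file, \@things );
}

# Sample arguments stand in for @ARGV here.
my ( $operation, $file, $things ) = split_args( 'clean', 'data.txt', '-a', '-b' );

# With only two arguments the list is empty and the loop body never runs,
# so no separate size check is needed before iterating.
for my $thing_to_do ( @{$things} ) {
    print "$operation $file: $thing_to_do\n";
}
```

In a real script the call would be split_args(@ARGV).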

OTHER TIPS

IMHO, the Perlish way of accomplishing what you need would be to use one of the Getopt modules on CPAN.

If you still want to do it by hand, I would go for the second option (this is similar to how we handle the first argument of a method call):

die "Must provide filename and operation\n" unless @ARGV >= 2;

my $op = shift @ARGV;
my $file = shift @ARGV;

if ( @ARGV ) {
    # handle the other arguments;
}

I would highly recommend using Getopt::Long for parsing command-line arguments. It is a core module that ships with Perl, it works very well, and it makes exactly what you're trying to do a breeze.

use strict;
use warnings;
use Getopt::Long;

my $first_option = undef;
my $second_option = undef;

GetOptions ('first-option=s' => \$first_option, 
            'second-option=s' => \$second_option);

die "Didn't pass in first-option, must be xxxyyyzzz."
    if ! defined $first_option;
die "Didn't pass in second-option, must be aaabbbccc."
    if ! defined $second_option;

foreach my $arg (@ARGV) {
    ...
}

This lets you use long option names, automatically stores the values in variables for you, and lets you check them. It even lets you add extra options later, such as 'version' or 'help', without having to do any extra parsing of the arguments:

# adding these to the above example...
my $VERSION = '1.000';
sub print_help { ... }

# ...and replacing the previous GetOptions with this...
GetOptions ('first-option=s' => \$first_option,
            'second-option=s' => \$second_option,
            'version' => sub { print "Running version $VERSION\n"; exit 1 },
            'help' => sub { print_help(); exit 2 } );

Then you can invoke it on the command line with a single or double dash, and with either the full option name or any unambiguous abbreviation of it (down to the first letter), and GetOptions figures it all out for you. It makes your program more robust and easier to figure out; it's more "guessable", you could say. The best part is that you never have to change the code that processes @ARGV, because GetOptions removes the options it recognizes and leaves the remaining arguments in place.
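A small sketch of the abbreviation handling and of what is left behind after parsing. It uses GetOptionsFromArray from Getopt::Long so that a sample argument list can stand in for @ARGV; the sample values are made up, and it assumes the default configuration (abbreviations enabled, POSIXLY_CORRECT not set).

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Sample command line; a real script would call GetOptions, which reads @ARGV.
my @args = ( '--first', 'xxx', '--sec=aaa', 'thing1', 'thing2' );

my ( $first_option, $second_option );
GetOptionsFromArray(
    \@args,
    'first-option=s'  => \$first_option,    # matched by the abbreviation --first
    'second-option=s' => \$second_option,   # matched by the abbreviation --sec
) or die "Bad options\n";

# The recognized options have been removed; only the extra arguments remain.
print "first=$first_option second=$second_option rest=@args\n";
```

The leftover items in @args are what the foreach loop over @ARGV would see in the earlier example.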

The most standard way of doing things in Perl is through CPAN.

So my first choice would be Getopt::Long. There is also a tutorial on DevShed: Processing Command Line Options with Perl

You can use a slice to extract the third through last items, for example:

[dsm@localhost:~]$ perl -le 'print join ", ", @ARGV[2..$#ARGV];' 1 2 3 4 5 6 7 8 9 10 00
3, 4, 5, 6, 7, 8, 9, 10, 00
[dsm@localhost:~]$ 

However, you should probably be using shift (or even better, Getopt::Long).

depesz's answer is one good way to go.

There is also nothing wrong with your second option:

my $op     = shift; # implicit shift from @ARGV
my $file   = shift; 
my @things = @ARGV;

# iterate over @things;

You could also skip copying @ARGV into @things and work directly on it. However, unless the script is very short, very simple, and unlikely to grow more complex over time, I would avoid taking too many shortcuts.

Whether you choose depesz's approach or this one is largely a matter of taste.

Deciding which is better is really a matter of philosophy. The crux of the issue is whether you should modify globals like @ARGV. Some would say it is no big deal as long as it is done in a highly visible way. Others would argue in favor of leaving @ARGV untouched.
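To make the difference concrete, here is a small sketch comparing the two approaches. The sample arguments are assigned to @ARGV only for illustration, and the variable names are hypothetical.

```perl
use strict;
use warnings;

# Sample arguments, assigned here only so the sketch runs on its own.
@ARGV = qw(clean data.txt -a -b);

# List assignment copies the values; @ARGV still holds all four items.
my ( $op, $file, @things ) = @ARGV;
my $count_after_copy = @ARGV;

# shift removes items from @ARGV itself, so only the extras remain.
my $op2               = shift @ARGV;
my $file2             = shift @ARGV;
my $count_after_shift = @ARGV;

print "after copy: $count_after_copy, after shift: $count_after_shift\n";
# prints "after copy: 4, after shift: 2"
```

Both approaches end up with the same mode, file, and list of extras; they differ only in whether the global array is consumed along the way.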

Pay no attention to anyone arguing for one option over the other on grounds of speed or memory. The number of command-line arguments is capped by the system at a size that is small by in-memory standards, so neither method offers a meaningful optimization over the other.

Getopt::Long, as has been mentioned, is an excellent choice, too.

Do have a look at MooseX::Getopt, because it may whet your appetite for even more things Moosey!

Example of MooseX::Getopt:

# getopt.pl

{
    package MyOptions;
    use Moose;
    with 'MooseX::Getopt';

    has oper   => ( is => 'rw', isa => 'Int', documentation => 'op doc stuff' );
    has file   => ( is => 'rw', isa => 'Str', documentation => 'about file' );
    has things => ( is => 'rw', isa => 'ArrayRef', default => sub {[]} );

    no Moose;
}

my $app = MyOptions->new_with_options;

for my $thing (@{ $app->things }) {
    print $app->file, " : ", $thing, "\n";
}

# => file.txt : item1
# => file.txt : item2
# => file.txt : item3

Will produce the above when run like so:

perl getopt.pl --oper 1 --file file.txt --things item1 --things item2 --things item3


These Moose types are checked. For example, perl getopt.pl --oper "not a number" produces:

Value "not a number" invalid for option oper (number expected)

And for free you always get a usage list ;-)

usage: getopt.pl [long options...]
         --file         about file
         --oper         op doc stuff
         --things    

/I3az/

For the more general case with any array:

for(my $i=2; $i<@array; $i++) {
    print "$array[$i]\n";
}

That loops through the array, starting with the third element (index 2). Obviously, for the specific example you give, depesz's answer is the most straightforward and best.
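The same iteration can also be written with a range or a slice, which avoids the C-style counter. This is a sketch with a made-up sample array, not code from the original answer.

```perl
use strict;
use warnings;

my @array = qw(mode file.txt alpha beta gamma);

# Index range: 2 .. $#array is an empty list when the array has fewer
# than three items, so the loop body is simply skipped in that case.
my @seen;
for my $i ( 2 .. $#array ) {
    push @seen, $array[$i];
}

# Slice: the same elements without explicit indexing in the loop.
my @tail = @array[ 2 .. $#array ];

print "@seen\n";    # prints "alpha beta gamma"
print "@tail\n";    # prints "alpha beta gamma"
```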

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow