Is 'shift' evil for processing Perl subroutine parameters?

https://stackoverflow.com/questions/1328411

19-09-2019
|

Question

I'm frequently using shift to unpack function parameters:

sub my_sub {
    my $self = shift;
    my $params = shift;
    ....
}

However, many on my colleagues are preaching that shift is actually evil. Could you explain why I should prefer

sub my_sub {
    my ($self, $params) = @_;
    ....
}

to shift?

Solution

The use of shift to unpack arguments is not evil. It's a common convention and may be the fastest way to process arguments (depending on how many there are and how they're passed). Here's one example of a somewhat common scenario where that's the case: a simple accessor.

use Benchmark qw(cmpthese);

sub Foo::x_shift { shift->{'a'} }
sub Foo::x_ref   { $_[0]->{'a'} }
sub Foo::x_copy  { my $s = $_[0]; $s->{'a'} }

our $o = bless {a => 123}, 'Foo';

cmpthese(-2, { x_shift => sub { $o->x_shift },
               x_ref   => sub { $o->x_ref   },
               x_copy  => sub { $o->x_copy  }, });

The results on perl 5.8.8 on my machine:

            Rate  x_copy   x_ref x_shift
x_copy  772761/s      --    -12%    -19%
x_ref   877709/s     14%      --     -8%
x_shift 949792/s     23%      8%      --

Not dramatic, but there it is. Always test your scenario on your version of perl on your target hardware to find out for sure.

shift is also useful in cases where you want to shift off the invocant and then call a SUPER:: method, passing the remaining @_ as-is.

sub my_method
{
  my $self = shift;
  ...
  return $self->SUPER::my_method(@_);
}

If I had a very long series of my $foo = shift; operations at the top of a function, however, I might consider using a mass copy from @_ instead. But in general, if you have a function or method that takes more than a handful of arguments, using named parameters (i.e., catching all of @_ in a %args hash or expecting a single hash reference argument) is a much better approach.

OTHER TIPS

It is not evil, it is a taste sort of thing. You will often see the styles used together:

sub new {
    my $class = shift;
    my %self = @_;
    return bless \%self, $class;
}

I tend to use shift when there is one argument or when I want to treat the first few arguments differently than the rest.

This is, as others have said, a matter of taste.

I generally prefer to shift my parameters into lexicals because it gives me a standard place to declare a group of variables that will be used in the subroutine. The extra verbosity gives me a nice place to hang comments. It also makes it easy to provide default values in a tidy fashion.

sub foo {
    my $foo = shift;       # a thing of some sort.
    my $bar = shift;       # a horse of a different color.
    my $baz = shift || 23; # a pale horse.

    # blah
}

If you are really concerned about the speed of calling your routines, don't unpack your arguments at all--access them directly using @_. Be careful, those are references to the caller's data you are working with. This is a common idiom in POE. POE provides a bunch of constants that you use to get positional parameters by name:

sub some_poe_state_handler {
    $_[HEAP]{some_data} = 'chicken';
    $_[KERNEL]->yield('next_state');
}

Now the big stupid bug you can get if you habitually unpack params with shift is this one:

sub foo {
    my $foo = shift;
    my $bar = shift;
    my @baz = shift;

    # I should really stop coding and go to bed.  I am making dumb errors.

}

I think that consistent code style is more important than any particular style. If all my coworkers used the list assignment style, I'd use it too.

If my coworkers said there was a big problem using shift to unpack, I'd ask for a demonstration of why it is bad. If the case is solid, then I'd learn something. If the case is bogus, I could then refute it and help stop the spread of anti-knowledge. Then I'd suggest that we determine a standard method and follow it for future code. I might even try to set up a Perl::Critic policy to check for the decided upon standard.

Assigning @_ to a list can bring some helpul addtional features.

It makes it slightly easier to add additional named parameters at a later date, as you modify your code Some people consider this a feature, similar to how finishing a list or a hash with a trailing ',' makes it slightly easier to append members in the future.

If you're in the habit of using this idiom, then shifting the arguments might seem harmful, because if you edit the code to add an extra argument, you could end up with a subtle bug, if you don't pay attention.

e.g.

sub do_something {
 my ($foo) = @_;
}

later edited to

sub do_something {
  my ($foo,$bar) = @_; # still works
}

however

sub do_another_thing {
  my $foo = shift;
}

If another colleague, who uses the first form dogmatically (perhaps they think shift is evil) edits your file and absent-mindedly updates this to read

sub do_another_thing {
  my ($foo, $bar)  = shift; # still 'works', but $bar not defined
}

and they may have introduced a subtle bug.

Assigning to @_ can be more compact and efficient with vertical space, when you have a small number of parameters to assign at once. It also allows for you to supply default arguments if you're using the hash style of named function parameters

e.g.

 my (%arguments) = (user=>'defaultuser',password=>'password',@_);

I would still consider it a question of style / taste. I think the most important thing is to apply one style or the other with consistency, obeying the principle of least surprise.

I don't think shift is evil. The use of shift shows your willingness to actually name variables - instead of using $_[0].

Personally, I use shift when there's only one parameter to a function. If I have more than one parameter, I'll use the list context.

 my $result = decode($theString);

 sub decode {
   my $string = shift;

   ...
 }

 my $otherResult = encode($otherString, $format);

 sub encode {
   my ($string,$format) = @_;
   ...
 }

Perl::Critic is your friend here. It follows the the "standards" set up in Damian Conway's book Perl Best Practices. Running it with --verbose 11 gives you an explanation on why things are bad. Not unpacking @_ first in your subs is a severity 4 (out of 5). E.g:

echo 'package foo; use warnings; use strict; sub quux { my foo= shift; my (bar,baz) = @_;};1;' | perlcritic -4 --verbose 11

Always unpack @_ first at line 1, near 'sub quux { my foo= shift; my (bar,baz) = @_;}'.
  Subroutines::RequireArgUnpacking (Severity: 4)
    Subroutines that use `@_' directly instead of unpacking the arguments to
    local variables first have two major problems. First, they are very hard
    to read. If you're going to refer to your variables by number instead of
    by name, you may as well be writing assembler code! Second, `@_'
    contains aliases to the original variables! If you modify the contents
    of a `@_' entry, then you are modifying the variable outside of your
    subroutine. For example:

       sub print_local_var_plus_one {
           my ($var) = @_;
           print ++$var;
       }
       sub print_var_plus_one {
           print ++$_[0];
       }

       my $x = 2;
       print_local_var_plus_one($x); # prints "3", $x is still 2
       print_var_plus_one($x);       # prints "3", $x is now 3 !
       print $x;                     # prints "3"

    This is spooky action-at-a-distance and is very hard to debug if it's
    not intentional and well-documented (like `chop' or `chomp').

    An exception is made for the usual delegation idiom
    `$object->SUPER::something( @_ )'. Only `SUPER::' and `NEXT::' are
    recognized (though this is configurable) and the argument list for the
    delegate must consist only of `( @_ )'.

There is an optimization for list assignment.

The only reference I could find, is this one.

5.10.0 inadvertently disabled an optimization, which caused a measurable performance drop in list assignment, such as is often used to assign function parameters from @_ . The optimisation has been re-instated, and the performance regression fixed.

This is an example of the affected performance regression.

sub example{
  my($arg1,$arg2,$arg3) = @_;
}

It isn't intrinsically evil, but using it to pull off the arguments of a subroutine one by one is comparatively slow and requires a greater number of lines of code.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow