How is $_ different from named input or loop arguments?

https://stackoverflow.com/questions/5405423

perl
perlvar

29-10-2019

Question

Since I use $_ a lot, I want to understand it better. As far as I understand it, $_ is a global variable that holds implicit values.

As $_ seems to be set anyway, are there reasons to use named loop variables over $_ besides readability?

In what cases does it matter $_ is a global variable?

So if I use

for (@array){
    print $_;
}

or even

print $_ for @array;

it has the same effect as

for my $var (@array){
    print $var;
}

But does it work the same? I guess it does not exactly but what are the actual differences?

Update:

It seems $_ is even scoped correctly in this example. Is it not global anymore? I am using 5.12.3.

#!/usr/bin/perl
use strict;
use warnings;

my @array = qw/one two three four/;
my @other_array = qw/1 2 3 4/;

for (@array){
    for (@other_array){
        print $_;
    }
    print $_;
}

That correctly prints 1234one1234two1234three1234four.

For a global $_ I would have expected 1234 4 1234 4 1234 4 1234 4 ... or am I missing something obvious?

When is $_ global then?

Update:

OK, after reading the various answers and perlsyn more carefully, I have come to a conclusion:

Readability aside, it is better to avoid $_, because you have to know about its implicit localisation and take it into account; otherwise you may encounter unexpected behaviour.

Thanks for clarification of that matter.

Solution

are there reasons to use named loop variables over $_ besides readability?

The issue is not whether they are named. The issue is whether they are "package variables" or "lexical variables".

See "Coping with Scoping" for a very good description of the two systems of variables used in Perl:

http://perl.plover.com/FAQs/Namespaces.html

Package variables are global variables, and should therefore be avoided for all the usual reasons (e.g. action at a distance).

Avoiding package variables is a question of "correct operation" or "harder to inject bugs" rather than a question of "readability".

In what cases does it matter $_ is a global variable?

Everywhere.

The better question is:

In what cases is $_ local()ized for me?

There are a few places where Perl will local()ize $_ for you, primarily foreach, grep and map. All other places require that you local()ize it yourself, therefore you will be injecting a potential bug when you inevitably forget to do so. :-)
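A small sketch of that difference, using only core Perl. The in-memory filehandle and the variable names are made up for illustration. foreach and map put the old value of $_ back when they finish, whereas while (<$fh>) assigns to the global $_ and leaves it changed:

```perl
use strict;
use warnings;

$_ = 'original';

# foreach aliases $_ to each element and restores it afterwards.
for (1 .. 3) { }
my $after_foreach = $_;          # still 'original'

# map (and grep) do the same.
my @doubled = map { $_ * 2 } 1 .. 3;
my $after_map = $_;              # still 'original'

# while (<$fh>) is NOT localized: it assigns each line to the global $_.
my $data = "line 1\nline 2\n";
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";
while (<$fh>) { }
close $fh;
my $after_while = $_;            # undef: the final read hit end of file

print "after foreach: $after_foreach\n";
print "after map:     $after_map\n";
print "after while:   ", (defined $after_while ? $after_while : '(undef)'), "\n";
```

Putting local $_; in front of the while loop (inside its own block or sub) would keep the caller's value intact.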

OTHER TIPS

The classic failure mode of using $_ (implicitly or explicitly) as a loop variable is

for $_ (@myarray) {
  /(\d+)/ or die;
  foo($1);
}

sub foo {
  open(F, "foo_$_[0]") or die;
  while (<F>) {   # assigns each line to the global $_, i.e. to the aliased array element
    ...
  }
}

Here, because the loop variable in for/foreach is aliased to the actual list item, the while (<F>) in foo overwrites the elements of @myarray with the lines read from the files (and finally with undef when each file reaches its end).
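One way to avoid this failure mode, sketched under some assumptions: this foo is a hypothetical stand-in that reads from an in-memory string instead of a foo_N file, so the example is self-contained. Both the loop and the read use lexical variables, so nothing writes through the alias into the array:

```perl
use strict;
use warnings;

my @myarray = ('item 1', 'item 2');

# Hypothetical stand-in for foo(): reads an in-memory "file"
# rather than foo_$n, and returns the number of lines read.
sub foo {
    my ($n) = @_;
    my $contents = "first line of $n\nsecond line of $n\n";
    open(my $fh, '<', \$contents) or die "Cannot open: $!";
    my $count = 0;
    # A lexical $line leaves the global $_ untouched.
    while (my $line = <$fh>) {
        $count++;
    }
    close $fh;
    return $count;
}

my $lines_read = 0;
for my $item (@myarray) {
    $item =~ /(\d+)/ or die "No number in '$item'";
    $lines_read += foo($1);
}

print "@myarray\n";   # the array is unchanged
```

Either change on its own is enough here: a named loop variable means $_ is no longer aliased to the array elements, and a named line variable means the read loop never assigns to $_ at all.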

As it is usually used, $_ behaves the same as the named variable in your second example. $_ is just a shortcut: a default variable name for the current item in the current loop, which saves typing in a quick, simple loop. I tend to use named variables rather than the default. It makes it clearer what the variable holds, and if I happen to need a nested loop there are no conflicts.

Since $_ is a global variable, you may get unexpected values if you rely on the value it had in an earlier code block. The new code block may be part of a loop or other operation that puts its own values into $_, overwriting what you expected to be there.

The risk in using $_ is that it is global (unless you localise it with local $_), and so if some function you call in your loop also uses $_, the two uses can interfere.

For reasons which are not clear to me, this has only bitten me occasionally, but I usually localise $_ if I use it inside packages.
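A minimal sketch of that interference and of the local $_ remedy. The two subs are hypothetical and exist only for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper that uses $_ internally without protecting it.
sub careless {
    $_ = 'clobbered';
    return;
}

# Same idea, but $_ is localized for the duration of the call.
sub careful {
    local $_ = 'scratch value';
    return;
}

$_ = 'mine';
careful();
my $after_careful = $_;    # 'mine': restored when careful() returned

careless();
my $after_careless = $_;   # 'clobbered': the caller's value is gone

print "after careful():  $after_careful\n";
print "after careless(): $after_careless\n";
```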

There is nothing special about $_ apart from its being the default argument for many functions. If you explicitly scope your $_ lexically with my, perl will use that lexical $_ rather than the global one. There is nothing strange in this; it is just like any other named variable. (Lexical $_ was added in Perl 5.10, marked experimental in 5.18 and removed in 5.24, so the examples below only run on those versions.)

sub p { print "[$_]"; } # Prints the global $_
# Compare and contrast
for my $_ (b1..b5) { for my $_ (a1..a5) { p } } print "\n"; # ex1
for my $_ (b1..b5) { for       (a1..a5) { p } } print "\n"; # ex2
for       (b1..b5) { for my $_ (a1..a5) { p } } print "\n"; # ex3
for       (b1..b5) { for       (a1..a5) { p } } print "\n"; # ex4

You may be slightly mystified by the output until you find out that perl restores the original value of the loop variable on loop exit (see perlsyn).

Note ex2 above. Here the second loop is using the lexically scoped $_ declared in the first loop. Subtle, but expected. Again, this value is preserved on exit so the two loops do not interfere.
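The restore-on-exit behaviour can be checked with plain foreach loops only, so this sketch does not depend on my $_. The record sub and the @seen array are names invented for the example:

```perl
use strict;
use warnings;

my @seen;
sub record { push @seen, $_ }       # reads the global $_

$_ = 'before';
for ('b1', 'b2') {
    for ('a1', 'a2') { record() }   # inner loop aliases $_ to a1, then a2
    record();                       # the outer value is back: b1, then b2
}
my $after = $_;                     # 'before' again

print "@seen\n";
print "$after\n";
```

The inner loop never disturbs the outer loop's current item, and neither loop disturbs the value $_ had before the loops started.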

Licensed under: CC-BY-SA with attribution
