Question

According to the Perl documentation on file globbing, the <*> operator or glob() function, when used in a scalar context, should iterate through the list of files matching the specified pattern, returning the next file name each time it is called or undef when there are no more files.

However, the iteration only seems to work inside a loop. Outside a loop, it seems to start over immediately, before all values have been read.

From the Perl docs:

In scalar context, glob iterates through such filename expansions, returning undef when the list is exhausted.

http://perldoc.perl.org/functions/glob.html

However, in scalar context the operator returns the next value each time it's called, or undef when the list has run out.

http://perldoc.perl.org/perlop.html#I/O-Operators

Example code:

use warnings;
use strict;

my $filename;

# in scalar context, <*> should return the next file name
# each time it is called or undef when the list has run out

$filename = <*>;
print "$filename\n";
$filename = <*>;      # doesn't work as documented, starts over and
print "$filename\n";  # always returns the same file name
$filename = <*>;
print "$filename\n";

print "\n";

print "$filename\n" while $filename = <*>; # works in a loop, returns next file
                                           # each time it is called

In a directory containing three files (file1.txt, file2.txt, and file3.txt), the above code will output:

file1.txt
file1.txt
file1.txt

file1.txt
file2.txt
file3.txt

Note: The actual perl script should be outside the test directory, or you will see the file name of the script in the output as well.

Am I doing something wrong here, or is this how it is supposed to work?


Solution

Here's a way to capture the magic of the <> glob operator's state into an object that you can manipulate in a normal sort of way: anonymous subs (and/or closures)!

sub all_files {
    return sub { scalar <*> };
}

my $iter = all_files();
print $iter->(), "\n";
print $iter->(), "\n";
print $iter->(), "\n";

or perhaps:

sub dir_iterator {
    my $dir = shift;
    return sub { scalar glob("$dir/*") };
}
my $iter = dir_iterator("/etc");
print $iter->(), "\n";
print $iter->(), "\n";
print $iter->(), "\n";

Then again, my inclination is to file this under "curiosity". Ignore this particular oddity of glob() / <> and use opendir/readdir, IO::All/readdir, or File::Glob instead :)
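For the opendir/readdir route, here is a minimal sketch of the same closure idea built on a directory handle instead of glob. The name readdir_iterator is made up for illustration, and skipping dotfiles is an assumption made to mimic what <*> matches. Note that readdir returns entries in directory order, which is not necessarily sorted.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): returns a closure that
# yields one entry of $dir per call, and undef once the directory
# has been read completely.
sub readdir_iterator {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    return sub {
        while (defined(my $entry = readdir $dh)) {
            next if $entry =~ /^\./;    # skip dotfiles, as <*> does
            return $entry;
        }
        return;                         # exhausted: undef in scalar context
    };
}

my $next = readdir_iterator('.');
while (defined(my $name = $next->())) {
    print "$name\n";
}
```

Because the state lives in the directory handle held by the closure, it does not matter which line of code calls $next->(): every call advances the same iterator.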

OTHER TIPS

The following code also seems to create 2 separate instances of the iterator...

for ( 1..3 )
{
   $filename = <*>;
   print "$filename\n" if defined $filename;
   $filename = <*>;
   print "$filename\n" if defined $filename;
}

I guess I see the logic there, but it is rather counterintuitive and seems to contradict the documentation. The docs don't mention anything about having to be in a loop for the iteration to work.

Also from perlop:

A (file)glob evaluates its (embedded) argument only when it is starting a new list.

Calling glob creates a list, which is either returned whole (in list context) or retrieved one element at a time (in scalar context). But each call to glob creates a separate list.
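If the goal is simply to pull names off one at a time from different lines of code, one way around the per-call list is to call glob once in list context and consume the resulting array. This is a rough sketch, not the only approach; the variable names are arbitrary.

```perl
use strict;
use warnings;

# Expand the pattern once, in list context, and keep the whole list.
my @files = glob('*');

# Each shift removes and returns the next name, no matter which line
# performs it; shift returns undef once @files is empty.
my $first  = shift @files;
my $second = shift @files;

print "$first\n"  if defined $first;
print "$second\n" if defined $second;
```

The trade-off is that the whole list is built up front, which is normally fine for a directory listing.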

(Scratching away at my rusty memory of Perl...) I think that multiple lexical instances of <*> are treated as independent invocations of glob, whereas in the while loop you are invoking the same "instance" (whatever that means).

Imagine, for instance, if you did this:

while (<*>) { ... }
...
while (<*>) { ... }

You certainly wouldn't expect those two invocations to interfere with each other.

Licensed under: CC-BY-SA with attribution