I never liked File::Find because it's just a mess. It swallows up your entire program because it wants everything to be in your wanted subroutine. Plus, I don't like having half of my code scattered all over the place. However, what other tools come standard with every installation of Perl? I have to make do.
I prefer to toss all of my files into an array. It keeps the code clean. My find just finds. I do the rest of my processing elsewhere. I also embed my wanted subroutine in my find command. It keeps everything in one place.
Also, you can't use unlink to remove a directory. Use remove_tree from File::Path. That's a standard module. You can also use readdir to see how many entries a directory has. That's a good way to check whether it's empty:
    use strict;
    use warnings;
    use feature qw(say);
    use File::Find;
    use File::Path qw(make_path remove_tree);

    my $testdir     = 'C:/jason/temp/test';
    my $mdate_limit = 30;

    my @files;    # We'll store the files here
    my %dirs;     # And we'll track the directories that may be empty

    #
    # First find the files
    #
    find( sub {
        return unless -f;                   # We want just files.
        return if -M _ < $mdate_limit;      # Skip if modified within the last $mdate_limit days
                                            # ("_" reuses the stat from -f and keeps "<" a comparison)
        push @files, $File::Find::name;     # We're interested in this file,
        $dirs{$File::Find::dir} = 1;        # and the directory that file is in
    }, $testdir );

    #
    # Delete the files that you've found
    #
    unlink @files;

    #
    # Go through the directories and see which are empty
    #
    for my $dir ( sort keys %dirs ) {
        opendir my $dir_fh, $dir or next;   # We'll skip bad reads
        my @dir_files = readdir $dir_fh;
        closedir $dir_fh;                   # Directory handles need closedir, not close
        if ( @dir_files <= 2 ) {            # Directory is empty if there's only "." and ".." in it
            remove_tree( $dir )
                or warn qq(Can't remove directory "$dir"\n);
        }
    }
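If you check for empty directories in more than one place, the readdir test above could be pulled into a small helper. This is only a sketch: is_empty_dir is a name I made up for illustration, and it filters out "." and ".." explicitly instead of counting to two.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): returns true if
# $dir contains nothing besides "." and "..".
sub is_empty_dir {
    my ($dir) = @_;
    opendir my $dh, $dir or return;    # Unreadable or missing: treat as "not empty"
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;
    return @entries == 0;
}
```

With that in place, the loop body shrinks to something like: remove_tree($dir) if is_empty_dir($dir);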
Notice that I've embedded my wanted routine:
    find( sub {
        return unless -f;                   # We want just files.
        return if -M _ < $mdate_limit;      # File has been modified in the last $mdate_limit days
        push @files, $File::Find::name;     # We're interested in this file
        $dirs{$File::Find::dir} = 1;        # The directory that file is in
    }, $testdir );
The alternative is this:
    find( \&wanted, $testdir );

    sub wanted {
        return unless -f;                   # Okay...
        return if -M _ < $mdate_limit;      # Um... Where's $mdate_limit defined?
        push @files, $File::Find::name;     # And @files?
        $dirs{$File::Find::dir} = 1;        # And %dirs?
    }
The problem is that my wanted subroutine contains three global variables. And it's possible for my find command to get separated from my wanted subroutine. In three months' time, you'll have to search all over your code to find that wanted routine.
And when you do see that wanted subroutine, there are those three mysterious global variables. Where are they defined? Is that a bug?
By combining the subroutine with my find, I guarantee that the subroutine the find command needs won't drift away from it. Plus, it tames the globalness of those three variables: they're declared right above the one call that uses them.
There is nothing preventing me from deleting the files inside the find command. It's usually not a good idea to change the directory structure while searching it, but this should be fine.
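For comparison, here is roughly what deleting inside the find could look like. Treat it as a sketch under my own assumptions, not a recommendation: delete_old_files is a hypothetical wrapper I invented for this example, and it relies on File::Find's default behavior of changing into each directory so that $_ is the bare file name.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical wrapper, for illustration only: delete plain files under $dir
# that were last modified at least $days days ago, from inside the wanted
# routine. Returns the number of files removed.
sub delete_old_files {
    my ($dir, $days) = @_;
    my $deleted = 0;
    find( sub {
        return unless -f;           # Plain files only
        return if -M _ < $days;     # Modified recently: keep it
        if ( unlink $_ ) {          # find() has already chdir'ed into the file's directory
            $deleted++;
        }
        else {
            warn qq(Can't remove "$File::Find::name": $!\n);
        }
    }, $dir );
    return $deleted;
}
```

You'd call it as delete_old_files($testdir, $mdate_limit). It works, but the find is now doing two jobs at once.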
However, I like my find command to just find the files I'm interested in. I don't want half of my program stuffed in there. It becomes a maintenance nightmare. I'll put up with a bit of inefficiency. It might take a full second or two to load my @files array with a million files, but I'll spend a lot longer than that as soon as I have to debug my program.