Question

Well, I tried and failed, so here I am again.

I need to match my absolute path pattern.

 /public_html/mystuff/10000001/001/10/01.cnt

I am in taint mode, etc.

#!/usr/bin/perl -Tw
use CGI::Carp qw(fatalsToBrowser);
use strict;
use warnings;
$ENV{PATH} = "bin:/usr/bin";
delete ($ENV{qw(IFS CDPATH BASH_ENV ENV)});

I need to open the same file a couple of times or more, and taint mode forces me to untaint the file name every time. Although I may be doing something else wrong, I still need help constructing this pattern for future reference.

my $file = "$var[5]";
if ($file =~ /(\w{1}[\w-\/]*)/) {
$under = "/$1\.cnt";
} else {
ErroR();
}

You can see by my beginner attempt that I am close to clueless.

I had to add the forward slash and extension to $1 due to my poorly constructed, but working, regex.

So, I need help learning how to fix my expression so $1 represents /public_html/mystuff/10000001/001/10/01.cnt

Could someone hold my hand here and show me how to make:

$file =~ /(\w{1}[\w-\/]*)/ match my absolute path /public_html/mystuff/10000001/001/10/01.cnt ?

Thanks for any assistance.


Solution

Edit: Using $ in the pattern (as I did before) is not advisable here because it can match \n at the end of the filename. Use \z instead because it unambiguously matches the end of the string.
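To make the difference concrete, here is a minimal sketch. The filename with a trailing newline is a made-up input, and the variable names are only for illustration:

```perl
use strict;
use warnings;

# Hypothetical input: a filename that still carries a trailing newline.
my $name = "01.cnt\n";

# $ may match just before a final newline, so this pattern still succeeds.
my $dollar_ok = $name =~ /^[0-9]{2}\.cnt$/ ? 1 : 0;

# \z only matches at the very end of the string, so this pattern fails.
my $z_ok = $name =~ /^[0-9]{2}\.cnt\z/ ? 1 : 0;

print "with \$: $dollar_ok, with \\z: $z_ok\n";
```

With `$`, the captured value would be clean, but the check itself accepts input you probably did not intend to accept.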

Be as specific as possible in what you are matching:

my $fn = '/public_html/mystuff/10000001/001/10/01.cnt';

if ( $fn =~ m!
    ^(
        /public_html
        /mystuff
        /[0-9]{8}
        /[0-9]{3}
        /[0-9]{2}
        /[0-9]{2}\.cnt
    )\z!x ) {
    print $1, "\n";
}

Alternatively, you can reduce the vertical space taken by the code by putting what I assume to be a common prefix, '/public_html/mystuff', in a variable, combining the various components in a qr// construct (see perldoc perlop), and then using the conditional operator (?:):

#!/usr/bin/perl

use strict;
use warnings;

my $fn = '/public_html/mystuff/10000001/001/10/01.cnt';
my $prefix = '/public_html/mystuff';
my $re = qr!^($prefix/[0-9]{8}/[0-9]{3}/[0-9]{2}/[0-9]{2}\.cnt)\z!;

$fn = $fn =~ $re ? $1 : undef;

die "Filename did not match the requirements" unless defined $fn;
print $fn, "\n";
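Since you mention untainting the same name repeatedly, the pattern above could be wrapped in a small helper that is called once, with the result reused for every open. This is only a sketch: the name untaint_cnt_path is made up, and the \Q...\E around the prefix is an addition of mine so that any regex metacharacters in the prefix would be matched literally.

```perl
use strict;
use warnings;

my $prefix = '/public_html/mystuff';

# \Q...\E quotes the prefix so it is matched literally (an added precaution).
my $re = qr!^(\Q$prefix\E/[0-9]{8}/[0-9]{3}/[0-9]{2}/[0-9]{2}\.cnt)\z!;

# Hypothetical helper: returns the captured (untainted) path, or undef.
sub untaint_cnt_path {
    my ($candidate) = @_;
    return unless defined $candidate;
    return $candidate =~ $re ? $1 : undef;
}

for my $candidate (
    '/public_html/mystuff/10000001/001/10/01.cnt',    # well-formed
    "/public_html/mystuff/10000001/001/10/01.cnt\n",  # trailing newline
    '/public_html/mystuff/../../etc/passwd',          # traversal attempt
) {
    my $clean = untaint_cnt_path($candidate);
    print defined $clean ? "accepted\n" : "rejected\n";
}
```

Only the first candidate is accepted. Keep the returned value in a variable and use that variable everywhere afterwards, rather than going back to the original input.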

Also, I cannot reconcile using a relative path as you do in

$ENV{PATH} = "bin:/usr/bin";

with using taint mode. Did you mean the absolute path below?

$ENV{PATH} = "/bin:/usr/bin";
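While on the subject of the environment: the delete line in the question uses a scalar element lookup, $ENV{qw(...)}, rather than a hash slice, so it cannot remove all four keys. A sketch of the cleanup as I would write it, following the idiom shown in perldoc perlsec (the sub name clean_env is made up):

```perl
use strict;
use warnings;

# Hypothetical helper that resets the environment for taint mode.
sub clean_env {
    # Absolute directories only; a relative entry such as "bin" would be
    # resolved against whatever the current working directory happens to be.
    $ENV{PATH} = '/bin:/usr/bin';

    # Hash slice (@ENV{...}): deletes each of the four keys.
    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};
    return;
}

clean_env();
print "PATH is now $ENV{PATH}\n";
```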

OTHER TIPS

You talk about untainting the file path every time. That's probably because you aren't compartmentalizing your program steps.

In general, I break up these sorts of programs into stages. One of the earlier stages is data validation. Before I let the program continue, I validate all the data that I can. If any of it doesn't fit what I expect, I don't let the program continue. I don't want to get halfway through something important (like inserting stuff into a database) only to discover something is wrong.

So, when you get the data, untaint all of it and store the values in a new data structure. Don't use the original data or the CGI functions after that. The CGI module is just there to hand data to your program. After that, the rest of the program should know as little about CGI as possible.
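A rough sketch of that validation stage, assuming the raw input has already been collected into a hash. The field names, the count pattern, and the sub name validate_all are all invented for illustration; the file pattern is the one from the accepted answer.

```perl
use strict;
use warnings;

# Hypothetical rules: field name => pattern whose capture is the clean value.
my %rules = (
    file  => qr!^(/public_html/mystuff/[0-9]{8}/[0-9]{3}/[0-9]{2}/[0-9]{2}\.cnt)\z!,
    count => qr!^([0-9]{1,6})\z!,
);

# Validate every field up front; die before any real work starts.
sub validate_all {
    my (%raw) = @_;
    my %clean;
    for my $field (keys %rules) {
        my $value = $raw{$field};
        die "Missing field: $field\n" unless defined $value;
        $value =~ $rules{$field} or die "Invalid value for field: $field\n";
        $clean{$field} = $1;    # the capture is what taint mode treats as clean
    }
    return \%clean;
}

my $clean = validate_all(
    file  => '/public_html/mystuff/10000001/001/10/01.cnt',
    count => '3',
);
print "$clean->{file}\n";
```

After this point the rest of the program reads only from $clean, so the file name is untainted exactly once no matter how many times the file is opened.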

I don't know what you are doing, but it's almost always a design smell to take actual filenames as input.

Licensed under: CC-BY-SA with attribution