Question

I'm trying to tidy up some client data. Several entries have this as a URL:

http://not available

I thought I'd skip over those (and other potential mismatches) with Regexp::Common, but for some reason, a URL with an unescaped space matches $RE{URI}{HTTP}:

$ perl -MRegexp::Common='URI' -e 'my $url = q{http://not available}; print "yes\n" if $url =~ m#$RE{URI}{HTTP}#'
yes

I've seen the '{-nospace}' flag mentioned for other regexes, but appending it doesn't seem to apply/work here, either.

Am I interpreting things wrong? Are spaces allowed in http URLs in some context that I'm unaware of? Is there a way to force the regex to disallow it?

Solution

The substring http://not is a valid URL, so the unanchored regex matches that prefix and ignores the rest of the string. If you want to check that a given string is a URL (not that it merely contains a URL), you must anchor the match:

/\A$RE{URI}{HTTP}\z/
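
For cleaning up a list of entries, the anchored pattern can be wrapped in a small helper. This is only a sketch: is_http_url is a hypothetical name, the sample strings are made up for illustration, and it assumes Regexp::Common is installed from CPAN.

```perl
use strict;
use warnings;
use Regexp::Common qw(URI);

# Hypothetical helper (not part of Regexp::Common): returns 1 only when
# the whole string, start (\A) to end (\z), is an HTTP URL.
sub is_http_url {
    my ($candidate) = @_;
    return $candidate =~ /\A$RE{URI}{HTTP}\z/ ? 1 : 0;
}

# Illustrative sample data; the first entry is the one from the question.
for my $url ('http://not available', 'http://not', 'http://example.com/path?x=1') {
    printf "%-30s %s\n", $url, is_http_url($url) ? 'valid' : 'invalid';
}
```

With the anchors in place, the entry containing a space is rejected because the pattern can no longer stop after http://not. Note that \z is used rather than $, since $ would also accept a trailing newline.
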
Licensed under: CC-BY-SA with attribution