There is currently a discrepancy between the documentation as quoted in the question and the actual implementation as of perl 5.18.1. The problem lies with character classes. The documentation mentions \w, \s, \W and \S in what sounds like an exhaustive list, while the implementation taints on pretty much every use of […].
The right solution would probably be somewhere in between: character classes like [[:word:]] should taint, since they depend on the locale. A fixed list like mine should not. Character ranges like [a-z] depend on collation, so in my personal opinion they should taint as well. \d depends on what a locale considers a digit, so it, too, should taint, even though it is neither one of the escape sequences mentioned so far nor a bracketed class.
So in my opinion, both the documentation and the implementation need fixing. Perl devs are working on this. For progress information, please look at the perl bug report I filed.
For a fixed list of characters, one viable workaround appears to be a formulation as a disjunction, i.e. (?:\.|_) instead of [._]. It is more verbose, but should work even with the current (in my opinion buggy) perl versions.
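To compare the two formulations yourself, here is a minimal sketch. The foo/bar pattern, the capture helper and the default input are made up for illustration and are not from the question. Run it with taint mode enabled and pass the input as an argument, for example perl -T script.pl foo_bar. What it reports will likely depend on your perl version.

```perl
use strict;
use warnings;
use locale;                       # the taint behaviour discussed here only arises under locale
use Scalar::Util qw(tainted);

# Two ways to accept "foo.bar" or "foo_bar" (illustrative patterns only).
my $bracketed   = qr/^(foo[._]bar)$/;      # bracketed class
my $disjunction = qr/^(foo(?:\.|_)bar)$/;  # same set written as alternatives

# Hypothetical helper: return the first capture, or undef if there is no match.
sub capture {
    my ($input, $re) = @_;
    return $input =~ $re ? $1 : undef;
}

# Command-line arguments are tainted under -T; the default is for plain runs.
my $input = @ARGV ? $ARGV[0] : 'foo_bar';

print "taint mode is ", (${^TAINT} ? "on" : "off"), "\n";
for my $case ([bracketed => $bracketed], [disjunction => $disjunction]) {
    my ($label, $re) = @$case;
    my $got = capture($input, $re);
    printf "%-12s %s\n", $label,
        !defined($got) ? "no match" : tainted($got) ? "tainted" : "untainted";
}
```

Both patterns accept exactly the same strings, so switching to the disjunction should not change what is matched, only whether the capture ends up tainted.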