The typical use case is a regex that needs to include user input. Characters with special meaning in a regex (i.e. "the dirty dozen" in Perl) need to be escaped. Perl provides the "quotemeta" functionality for this: simply wrap the interpolated variables in \Q and \E. Tcl provides no such functionality (according to this page, not even with AREs).
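To make the problem concrete, here is a minimal sketch in Tcl. The input string is made up for illustration; the point is only that an unescaped "." in interpolated input behaves as a wildcard.

```tcl
# Hypothetical user input that happens to contain a regex metacharacter.
set input "a.b"

# Interpolated as-is, "." matches any character, so an unintended
# string matches as well as the intended one.
puts [regexp -- "^$input\$" "a.b"]   ;# prints 1
puts [regexp -- "^$input\$" "axb"]   ;# also prints 1

# With the dot escaped by hand, only the literal text matches.
puts [regexp -- {^a\.b$} "axb"]      ;# prints 0
```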

Is there a good (rigorous) implementation of quotemeta in Tcl out there?

Solution

Perl's quotemeta function simply prefixes every non-word character (i.e., every character other than the 26 lowercase letters, the 26 uppercase letters, the 10 digits, and underscore) with a backslash. This is overkill, since not all non-word characters are regexp metacharacters, but it's simple and safe, because escaping a non-word character that doesn't need escaping is harmless.

I believe this implementation is correct:

proc quotemeta {str} {
    regsub -all -- {[^a-zA-Z0-9_]} $str {\\&} str
    return $str
}

But thanks to glenn's comment, this one is better, at least for modern versions of Tcl (\W, which matches any non-word character, is available from Tcl 8.1 onward, the release that introduced advanced regular expressions):

proc quotemeta {str} {
    regsub -all -- {\W} $str {\\&} str
    return $str
}

(I'm assuming that Tcl's regular expressions are similar enough to Perl's so that this will do the same job in Tcl that it does in Perl.)
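As a usage sketch, the quoted string can be embedded in a larger pattern, much as \Q...\E would be in Perl. The sample input below is made up, and the proc is repeated so the snippet runs on its own:

```tcl
proc quotemeta {str} {
    # Prefix every non-word character with a backslash.
    regsub -all -- {\W} $str {\\&} str
    return $str
}

# Hypothetical user input containing several metacharacters.
set user {1+1=2 (really?)}

# Anchor the quoted input so the whole string must match literally.
set re "^[quotemeta $user]\$"
puts [regexp -- $re $user]      ;# prints 1

# The quoted form of a short string, for inspection.
puts [quotemeta {a.b*c}]        ;# prints a\.b\*c
```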

Other tips

I'll propose a solution, but I'm not confident it's correct.

#
#   notes
#
#   -  "]" has to come first in a character class (hence "[][...")
#   -  "-" has to come last in a character class
#   -  "\" is an escape inside "[]" in AREs, so a literal backslash is "\\"
#   -  "#" is not special, but anticipating the x modifier...
#   -  "-" is not special, but anticipating interpolation within "[]"...
#   -  "/" is not special in Tcl
#
proc quotemeta {str} {
    regsub -all -- {[][#$^*()+{}\\|.?-]} $str {\\\0} str
    return $str
}
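
One way to gain some confidence in a hand-picked metacharacter list is a round-trip check: quote a string, anchor the result, and confirm it still matches the original. The sketch below is only that, a sketch. The names quotemeta_min and roundtrip_ok are made up, the sample strings are arbitrary, and the backslash is written as \\ inside the class on the assumption that Tcl's default ARE syntax treats \ as an escape within brackets.

```tcl
# Hypothetical name: escapes only a fixed list of metacharacters.
# The backslash is written as \\ so that it is itself in the class.
proc quotemeta_min {str} {
    regsub -all -- {[][#$^*()+{}\\|.?-]} $str {\\\0} str
    return $str
}

# Returns 1 if the quoted, anchored form of $s matches $s itself.
# $quoter is the name of the quoting command to exercise.
proc roundtrip_ok {quoter s} {
    set re "^(?:[$quoter $s])\$"
    return [regexp -- $re $s]
}

# Arbitrary sample strings covering the characters in the class.
foreach s {{a.b} {x|y} {C:\temp\1} {[abc]} {a{2}} {1+1=2?} {^start$} {a-b#c}} {
    puts [format "%-12s %d" $s [roundtrip_ok quotemeta_min $s]]
}
```

A passing round trip shows only that the quoted pattern still matches its own source string; it does not prove that no other string matches, so it is a sanity check rather than a proof of rigor.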
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow