Question

We have been using the following JavaScript regex to find and remove all non-alphanumeric characters apart from - and +:

outputString = outputString.replace(/[^\w|^\+|^-]*/g, "");

However, it doesn't work entirely: it doesn't replace the ^ and | characters. I can't help but wonder if this has something to do with ^ and | being used as metacharacters in the regex itself.

I've tried switching to use [\W|^+|^-], but that replaces the - and +. I thought that possibly a lookahead assertion may be the answer, but I'm not very sure how to implement them.

Has anyone got an idea how to accomplish this?


Solution

Character classes do not do alternation, so the | is literal. The ^ only negates the class when it is the first character inside the brackets; anywhere else it is treated literally.
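To see this concretely, here is a small sketch that runs the regex from the question on a sample string. The sample string and variable names are made up for illustration.

```javascript
// The regex from the question. Only the first ^ negates the class;
// the later ^ and both | are ordinary members of the (negated) set,
// so they are excluded from the match and survive the replace.
const original = /[^\w|^\+|^-]*/g;

const sample = "a^b|c+d-e!";
const kept = sample.replace(original, "");

console.log(kept); // "a^b|c+d-e" (only "!" was removed)
```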

Use this:

[^\w+-]+

(Also, a - that is not first or last inside a character class should be escaped as \-, otherwise it can be read as a range. Be careful if more characters might be added to the exception list.)
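A minimal sketch of the character-class version in use, along with the range pitfall that appears if a character is appended after the -. The function name stripExceptPlusMinus and the sample inputs are hypothetical.

```javascript
// Hypothetical helper: keeps word characters, "+" and "-", removes the rest.
function stripExceptPlusMinus(input) {
  return input.replace(/[^\w+-]+/g, "");
}

console.log(stripExceptPlusMinus("a^b|c + d-e!")); // "abc+d-e"

// Pitfall: adding "." after the "-" turns "+-." into a range from "+" to ".",
// which also covers "," so commas are kept by accident.
const unescaped = "a,b".replace(/[^\w+-.]/g, "");
// Escaping the "-" restores the intended set: word chars, "+", "-", ".".
const escaped = "a,b".replace(/[^\w+\-.]/g, "");

console.log(unescaped); // "a,b"
console.log(escaped);   // "ab"
```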

You could also do it with a negative lookahead like this:

(?![+-])\W

Note: You do not want a * or + after that \W, since the lookahead only applies to the immediately following character (and the g flag makes the replace repeat until done).
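As an illustration of that note, the sketch below compares the lookahead with and without a quantifier on made-up input. With \W+ the lookahead is only checked at the start of each run, so a + or - later in the run is swallowed.

```javascript
// One character per match: the lookahead guards every removed character.
const perChar = /(?![+-])\W/g;
// Quantified: the lookahead guards only the first character of each run.
const perRun = /(?![+-])\W+/g;

const input = "a !+- b";

const good = input.replace(perChar, "");
const bad = input.replace(perRun, "");

console.log(good); // "a+-b"
console.log(bad);  // "ab" (the "+" and "-" were removed along with the run)
```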

Also note that \w and \W consider _ a word character. If you want underscores replaced as well, use (?![+-])[\W_] (or use explicit ranges in place of \w in the first expression).
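A short sketch of the underscore behaviour, using an invented sample string. The explicit-range variant assumes ASCII letters and digits are the only characters to keep, which matches what \w covers in JavaScript without the u or i flags, minus the underscore.

```javascript
const text = "snake_case + 1";

// \W does not match "_", so the underscore is kept.
const keepsUnderscore = text.replace(/(?![+-])\W/g, "");

// Adding "_" to the class removes it as well.
const dropsUnderscore = text.replace(/(?![+-])[\W_]/g, "");

// Same result with explicit ranges instead of \w in the negated class.
const explicitRanges = text.replace(/[^A-Za-z0-9+-]+/g, "");

console.log(keepsUnderscore); // "snake_case+1"
console.log(dropsUnderscore); // "snakecase+1"
console.log(explicitRanges);  // "snakecase+1"
```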

Licensed under: CC-BY-SA with attribution