Is there a better way to sanitize input with javascript?

https://stackoverflow.com/questions/23187013

06-07-2023

Question

I wanted to write a JavaScript function that sanitizes user input by removing any unwanted or dangerous characters.

It must allow only the following characters:

Alphanumeric characters (case insensitive): [a-z][0-9].
Inner whitespace, like "word1 word2".
Spanish characters (case insensitive): [áéíóúñü].
Underscore and hyphen [_-].
Dot and comma [.,].
Finally, the string must be trimmed with trim().

My first attempt was:

function sanitizeString(str){
    str = str.replace(/[^a-z0-9áéíóúñü_-\s\.,]/gim,"");
    return str.trim();
}

But if I did:

sanitizeString("word1\nword2")

it returns:

"word1
word2"

So I had to rewrite the function to explicitly remove \t\n\f\r\v\0:

function sanitizeString(str){
    str = str.replace(/([^a-z0-9áéíóúñü_-\s\.,]|[\t\n\f\r\v\0])/gim,"");
    return str.trim();
}

I'd like to know:

Is there a better way to sanitize input with JavaScript?
Why don't \n and \t get removed by the RegExp in the first version?

Solution

The new version of the sanitizeString function:

function sanitizeString(str){
    str = str.replace(/[^a-z0-9áéíóúñü \.,_-]/gim,"");
    return str.trim();
}
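
As a quick check, the sketch below runs this version against the input from the question and two other sample strings. The sample inputs are made up for illustration; the function body is the one given above.

```javascript
function sanitizeString(str) {
    // Remove every character that is not in the allowed set
    str = str.replace(/[^a-z0-9áéíóúñü \.,_-]/gim, "");
    return str.trim();
}

// The newline from the question's example is now removed
console.log(sanitizeString("word1\nword2")); // "word1word2"

// Allowed characters survive; leading and trailing spaces are trimmed
console.log(sanitizeString("  Año_2014, niño-1.  ")); // "Año_2014, niño-1."

// Markup characters are dropped, the letters and digits remain
console.log(sanitizeString("<script>alert(1)</script>")); // "scriptalert1script"
```

Note that this kind of filter only strips characters. Depending on where the value ends up (HTML, SQL, a file path), context-specific escaping may still be needed.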

The main problem was pointed out by @RobG and @Derek (@RobG, write your comment as an answer and I will accept it): \s doesn't mean what W3Schools currently says:

Find a whitespace character

It means what MDN says

Matches a single white space character, including space, tab, form feed, line feed. Equivalent to [ \f\n\r\t\v\u00a0\u1680\u180e\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u2028\u2029\u202f\u205f\u3000].
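
To see why this definition matters here, the following sketch tests a few of the characters from the MDN list against \s and then reruns the first attempt from the question. The variable names are illustrative only.

```javascript
// A few of the characters that \s covers according to the MDN definition
const whitespace = [" ", "\t", "\n", "\r", "\f", "\v", "\u00a0", "\u2028"];
const allMatch = whitespace.every((ch) => /\s/.test(ch));
console.log(allMatch); // true

// In the first attempt, \s sits inside the negated class, so every
// whitespace character counts as "allowed" and is left in place.
const firstAttempt = /[^a-z0-9áéíóúñü_-\s\.,]/gim;
const result = "word1\nword2".replace(firstAttempt, "");
console.log(result === "word1\nword2"); // true: the newline survives
```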

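I trusted W3Schools when I wrote the function. Because \s sat inside the negated character class, every whitespace character it matches, including \n and \t, counted as allowed and was never removed. The new version lists only a literal space instead.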

A second change was to move the dash character (-) to the end of the character class, so that it cannot be read as a range separator.
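
The effect of the dash position can be shown with a small sketch. The patterns below are simplified examples, not the ones from the function.

```javascript
// Between two characters, "-" defines a range: [_-z] covers U+005F through U+007A.
const asRange = /[_-z]/;
console.log(asRange.test("`")); // true: the backtick (U+0060) falls inside the range
console.log(asRange.test("-")); // false: the hyphen itself (U+002D) is outside it

// At the end of the class, "-" is a literal hyphen.
const asLiteral = /[z_-]/;
console.log(asLiteral.test("-")); // true
console.log(asLiteral.test("`")); // false

// Escaping it also makes it literal, regardless of position.
const escaped = /[_\-z]/;
console.log(escaped.test("-")); // true
```

In the original pattern the dash was followed by \s, which cannot be a range endpoint, so engines running without the u flag treat it as a literal there. Placing it last removes the ambiguity.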

Note 1: This is server-side validation using JavaScript.
Note 2: (for IBM Notes XPagers) I love JavaScript in XPages SSJS. For me, this is simpler than the Java way.

Licensed under: CC-BY-SA with attribution
