JSON.parse escaping for anti-XSS and any character insertion into html, attributes, and values from WebSockets

StackOverflow https://stackoverflow.com/questions/22273198

Question

I've read the cheat sheet, but I'm still unsure how exactly my data should be escaped to protect against XSS while allowing any valid character to be inserted into HTML, attributes, and variable values. Potential variable values are regexed before being put into any function like parseInt. Data is only received via a WebSocket connection.

Is JSON.parse safe to call on any string? If not, how must the data be made safe via javascript or at least tested to see if it doesn't conform?

When should the HTML and attribute escaping be done relative to JSON.parse?


Solution

You need to feed valid JSON into the JSON parse functions. Typically, whatever creates the JSON string is responsible for producing valid JSON, so that is where the HTML (and other) characters need to be escaped.
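
On the receiving side, JSON.parse does not execute its input: it either returns a value or throws a SyntaxError. A minimal sketch of a defensive wrapper follows; the helper name safeParse and the sample payloads are assumptions made for illustration.

```javascript
// Hypothetical helper: parse untrusted text without letting a bad payload
// throw out of the message handler.
function safeParse(text) {
    try {
        return { ok: true, value: JSON.parse(text) };
    } catch (err) {
        // JSON.parse throws a SyntaxError for anything that is not valid JSON.
        return { ok: false, error: err };
    }
}

var good = safeParse('{"data":"<b>hi</b>"}');   // valid JSON, parsed normally
var bad = safeParse("{data: alert(1)}");        // not JSON: rejected, never executed
```

A successful parse only means the text was well-formed JSON. The strings inside the result are still untrusted and need escaping wherever they are later inserted.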

If you google 'how to escape json', you'll find a lot of sites that show how to do it only halfway.

Most will point out a small group of characters and say to escape these:

\b  Backspace (ascii code 08)
\f  Form feed (ascii code 0C)
\n  New line
\r  Carriage return
\t  Tab
\v  Vertical tab
\'  Apostrophe or single quote
\"  Double quote
\\  Backslash character

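This is only partially correct. \v and \' are JavaScript string escapes, not JSON ones, and JSON.parse rejects them. Per the spec, the characters that must be escaped inside a JSON string are the double quote, the backslash, and the control characters U+0000 through U+001F.

Any character can also be escaped by using \u + four hexadecimal digits, e.g. "\u002F"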

Here's the spec: http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf

Personally, I only use the \u-hex notation for all escape sequences, and I never worry about whether the JSON might be used inside a JavaScript context.

JSON and JavaScript are pretty cool in that you could escape every character as \u+hex if you wanted, which makes XSS pretty much impossible (especially when inside double quotes).
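
As a rough sketch of that idea, the function below writes a string as a JSON string literal in which everything except ASCII letters and digits becomes \uXXXX. The name toUnicodeEscapedJson is hypothetical, and leaving alphanumerics unescaped is an assumption made for readability.

```javascript
// Hypothetical helper: encode a string as a JSON string literal in which
// every character outside [A-Za-z0-9] is written as \uXXXX.
function toUnicodeEscapedJson(str) {
    var out = '"';
    for (var i = 0; i < str.length; i++) {
        var ch = str.charAt(i);
        if (/[A-Za-z0-9]/.test(ch)) {
            out += ch;
        } else {
            // charCodeAt returns a UTF-16 code unit (0..0xFFFF),
            // so four hex digits are always enough.
            var hex = str.charCodeAt(i).toString(16);
            out += "\\u" + ("0000" + hex).slice(-4);
        }
    }
    return out + '"';
}

var original = '</script><img src=x onerror="alert(1)">';
var encoded = toUnicodeEscapedJson(original);
var roundTrip = JSON.parse(encoded); // decodes back to the original string
```

The encoded form contains no literal <, >, quote, or slash characters, yet JSON.parse restores the original text exactly.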

Update:

Keep in mind that escaping the JSON is only one part of a completely XSS-safe site. You still need to worry about how the JSON string might be used: it could be passed into a function as an argument, you might create an array with the data, or you might assign the string to document.getElementById('xyz').innerHTML.

So if the JSON data stays in the JavaScript context, you're safe to use \u+hex escaping.

When the string moves into the HTML context, you need to treat it as HTML:

 document.getElementById('xyz').innerHTML = json.data; // oh no: now it is in the HTML context

So you need to convert the JSON data with a function like this:

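// Characters that are significant in HTML, mapped to their entities.
var __entityMap = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': '&quot;',
    "'": '&#39;',
    "/": '&#x2F;'
};

// Returns a copy of the string with the characters above replaced.
String.prototype.toHtml = function() {
    return String(this).replace(/[&<>"'\/]/g, function (s) {
        return __entityMap[s];
    });
};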

So now you can do this:

document.getElementById('xyz').innerHTML = json.data.toHtml(); // ok: now safe for HTML
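
As for when to escape relative to JSON.parse, one reasonable order is: parse first, validate the shape, then escape at the point where the value enters HTML. The sketch below shows this without a DOM. The names escapeHtml and renderMessage and the {"data": "..."} message shape are assumptions for illustration.

```javascript
var ENTITY_MAP = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
    "/": "&#x2F;"
};

// Standalone version of the escaping step (hypothetical name).
function escapeHtml(value) {
    return String(value).replace(/[&<>"'\/]/g, function (ch) {
        return ENTITY_MAP[ch];
    });
}

// Hypothetical handler for one WebSocket message.
// Assumes the payload looks like {"data": "..."}.
function renderMessage(rawText) {
    var msg = JSON.parse(rawText);                  // 1. parse the raw text
    if (!msg || typeof msg.data !== "string") {     // 2. validate the shape
        throw new TypeError("expected a string in 'data'");
    }
    var safe = escapeHtml(msg.data);                // 3. escape on output
    // The attribute value is always double-quoted; the escaping above is
    // not sufficient for unquoted attributes.
    return '<p title="' + safe + '">' + safe + "</p>";
}

var html = renderMessage('{"data":"<img src=x onerror=\\"alert(1)\\"> & more"}');
```

In a browser, the returned string could then be assigned to innerHTML from the socket's message handler. Escaping before JSON.parse would not help, because the decoded strings are what end up in the page.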
Licensed under: CC-BY-SA with attribution