Question

I've been using Apache's StringEscapeUtils for HTML entities, but if you want to escape HTML attribute values, is there a standard way to do this? I guess that using the escapeHtml function won't cut it, since otherwise why would the Owasp Encoder interface have two different methods to cope with this?

Does anyone know what is involved in escaping HTML attributes vs. entities and what to do about attribute encoding in the case that you don't have the Owasp library to hand?

Solution

It looks like this is Rule #2 of OWASP's XSS Prevention Cheat Sheet. Note the bit where it says:

Properly quoted attributes can only be escaped with the corresponding quote

Therefore, I guess that so long as the attribute values are correctly delimited with double or single quotes and you escape those quotes (i.e. a double quote (") becomes &quot; and a single quote (') becomes &#x27; (or &#39;)), you should be OK. Note that Apache's StringEscapeUtils.escapeHtml is insufficient for this task on its own, since it does not escape the single quote ('); you can use String's replace or replaceAll method to handle that.
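For the quoted case, a dependency-free sketch might look like the following. The class name QuotedAttributeEscaper is hypothetical (it is not part of Apache Commons or the OWASP library), and it swaps the escapeHtml-plus-replace approach for a single hand-written pass, assuming the value will always be placed inside single or double quotes.

```java
// Hypothetical helper for illustration; not an Apache Commons or OWASP API.
final class QuotedAttributeEscaper {
    private QuotedAttributeEscaper() {}

    /** Escapes a value that will be written inside a quoted HTML attribute. */
    static String escape(String value) {
        StringBuilder out = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break; // keeps existing entities literal
                case '"':  out.append("&quot;"); break; // closes a double-quoted attribute
                case '\'': out.append("&#x27;"); break; // closes a single-quoted attribute
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Usage would be along the lines of "<div title=\"" + QuotedAttributeEscaper.escape(userInput) + "\">". Because everything happens in one pass, the ampersand in an entity that has just been emitted is never escaped a second time.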

Otherwise, if the attribute is written without quotes, e.g. <div attr=some_value>, then you need to follow the recommendation on that page and:

escape all characters with ASCII values less than 256 with the &#xHH; format (or a named entity if available) to prevent switching out of the attribute

I'm not sure whether there is a standard non-OWASP implementation of this, though. However, I guess it's good practice not to write attributes in this manner anyway!
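If you do have to deal with unquoted attributes and can't pull in the OWASP library, the &#xHH; rule quoted above can be roughly sketched by hand. The class name UnquotedAttributeEscaper is hypothetical, and the sketch assumes that ASCII letters and digits may be left untouched (the cheat sheet's full rule exempts alphanumeric characters). It does nothing special for control characters, which a production encoder would probably want to replace or reject.

```java
// Hypothetical helper for illustration; not an Apache Commons or OWASP API.
final class UnquotedAttributeEscaper {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    private UnquotedAttributeEscaper() {}

    /** Encodes every non-alphanumeric character below 256 as &#xHH;. */
    static String escape(String value) {
        StringBuilder out = new StringBuilder(value.length() * 2);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            // Assumption: only ASCII letters and digits are considered safe.
            boolean asciiAlnum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (c < 256 && !asciiAlnum) {
                // c < 256, so two hex digits are always enough.
                out.append("&#x").append(HEX[c >> 4]).append(HEX[c & 0xF]).append(';');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Spaces, equals signs, and angle brackets all become numeric references here, so a value such as "x onclick=alert(1)" cannot break out of the attribute and start a new one.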

Note that this is only valid when you are writing a standard attribute value; if the attribute is an href or a JavaScript event handler, then it's a different story. For examples of possible XSS attacks that can occur from unsafe code inside event handler attributes, see: http://ha.ckers.org/xss.html.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow