I have a function that escapes HTML special characters so that text can be inserted into HTML safely. It is very similar to the one in: Can I escape html special chars in javascript?

I know that JavaScript uses Unicode internally, but HTML pages may be encoded in different charsets such as UTF-8 or ISO-8859-1.

My question is: is there any issue with this very simple conversion, or should I take the page charset into consideration?

If so, how should I handle it?

PS: For example, the equivalent PHP function (http://php.net/manual/en/function.htmlspecialchars.php) has a parameter to select a charset.


Solution

No, JavaScript lives in the Unicode world so encoding issues are generally invisible to it. escapeHtml in the linked question is fine.
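For reference, here is a minimal sketch of what such a function might look like. The linked question's exact code is not reproduced here, so this escapeHtml body is an assumption for illustration; the point is that it only ever touches five ASCII characters, and everything else in the string passes through unchanged whatever the page charset is.

```javascript
// Hypothetical escapeHtml, assumed to resemble the one in the linked question.
// It replaces only the five characters that are special in HTML text and
// attribute values; '&' must be handled first so the entities produced by the
// later replacements are not escaped a second time.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Non-ASCII characters are left alone, because the function works on
// JavaScript strings (Unicode), not on encoded bytes.
console.log(escapeHtml('<b>café & "日本語"</b>'));
// prints: &lt;b&gt;café &amp; &quot;日本語&quot;&lt;/b&gt;
```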

The only place I can think of where JavaScript gets to see bytes would be data: URLs (typically hidden beneath base64). So this:

 var markup = '<p>Hello, '+escapeHtml(user_supplied_data);
 var url = 'data:text/html;base64,'+btoa(markup);
 iframe.src = url;

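is in principle a bad thing, because the URL declares no charset and the browser has to guess how the markup is encoded. Although I don't know of any browsers that will guess UTF-7 in this situation, a charset=... parameter should be supplied to ensure that the browser uses the appropriate encoding for the data. (For what it's worth, btoa treats each character of its input as a single ISO-8859-1 byte, and throws if the string contains a character above U+00FF.)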
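One way to make the encoding explicit is sketched below. The helper name toHtmlDataUrl is made up for this example, and it assumes an environment that provides TextEncoder and btoa (current browsers do). It encodes the markup as UTF-8 bytes first and declares that charset in the URL, so the browser does not have to guess and btoa never sees a character above U+00FF.

```javascript
// Hypothetical helper (the name is an assumption, not part of any library):
// builds a data: URL whose declared charset matches the bytes it contains.
function toHtmlDataUrl(markup) {
  // Encode the JavaScript string as UTF-8 bytes.
  const bytes = new TextEncoder().encode(markup);

  // btoa expects a "binary string" with one character per byte (0-255),
  // so map every byte to the character with the same code.
  let binary = '';
  for (const b of bytes) {
    binary += String.fromCharCode(b);
  }

  // State the charset explicitly instead of leaving it to the browser.
  return 'data:text/html;charset=utf-8;base64,' + btoa(binary);
}

// 'é' is the two UTF-8 bytes C3 A9, which base64-encode to "w6k=".
console.log(toHtmlDataUrl('é'));
// prints: data:text/html;charset=utf-8;base64,w6k=
```

With this in place, the assignment above would become iframe.src = toHtmlDataUrl(markup), where markup is still built with escapeHtml as before.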

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow