Question

I am using the following method to HTML-encode some text that is in Spanish:

string word = "configuración";
string encodedWord = System.Net.WebUtility.HtmlEncode(word);

The output is the expected:

configuraci&#243;n

But the &#243; text represents the HTML entity number for the Latin small letter "o" with acute.

However, I want to know if there is a way (a built-in function I don't know about, a library, etc.) to output the HTML entity name instead of the HTML entity number, and that also supports other characters (i.e. a generic solution).

What I've tried so far is to look up an HTML entities table (there were many when Googling, but I used this one: http://www.ascii.cl/htmlcodes.htm) and then write a custom method that replaces the relevant part of the word using a mapping.

So, if the encoded word contains &#243;, the matching text is replaced with its HTML entity name, which is &oacute;, but this is really painful since there are plenty of cases/scenarios.

Finally, the desired output is:

configuraci&oacute;n

Solution

WebUtility.HtmlEncode(word) writes characters in the upper half of ISO 8859-1 (Latin-1), code points 160 to 255, as numeric character references. The ó (U+00F3) falls in that range, which is why you get &#243; and not a named entity. You can try the AntiXss encoder instead:

Microsoft.Security.Application.AntiXss.HtmlEncode("ó"); 

or Microsoft.Security.Application.Encoder.HtmlEncode("ó");
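
If neither encoder gives you names, the mapping approach from the question can be made less painful by post-processing the output of WebUtility.HtmlEncode instead of the raw text. The following is only a minimal sketch: the class name NamedEntityEncoder and its small table are hypothetical and deliberately incomplete, so you would need to extend the table with whatever entities you care about. Any numeric reference that is not in the table is left untouched.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of .NET or AntiXSS.
public static class NamedEntityEncoder
{
    // Partial table (assumption: extend as needed): code point -> entity name.
    private static readonly Dictionary<int, string> Names = new Dictionary<int, string>
    {
        { 161, "iexcl" },  { 191, "iquest" },
        { 225, "aacute" }, { 233, "eacute" }, { 237, "iacute" },
        { 241, "ntilde" }, { 243, "oacute" }, { 250, "uacute" },
        { 252, "uuml" },
    };

    // Matches decimal numeric character references such as &#243;
    private static readonly Regex NumericRef = new Regex(@"&#(\d+);");

    public static string Encode(string text)
    {
        // Let the framework do the actual encoding first.
        string encoded = WebUtility.HtmlEncode(text);

        // Then swap known numeric references for their named form.
        return NumericRef.Replace(encoded, match =>
        {
            int codePoint;
            string name;
            if (int.TryParse(match.Groups[1].Value, out codePoint)
                && Names.TryGetValue(codePoint, out name))
            {
                return "&" + name + ";";
            }
            return match.Value; // unknown code point: keep the numeric reference
        });
    }
}
```

With this in place, NamedEntityEncoder.Encode("configuración") should return configuraci&oacute;n, while references such as &#39; (the apostrophe) stay numeric because they are not in the table.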
Licensed under: CC-BY-SA with attribution