Convert the Chinese Characters From ISO-8859-1 To UTF-8

Question 1

The current text encoding of that string is largely irrelevant. What you have there are HTML entities (numeric character references); they have little to do with the underlying "physical" encoding such as ISO-8859-1 or UTF-8. What you want is to decode those HTML entities into the byte representation of the characters in a specific encoding, in this case UTF-8. Therefore:

echo html_entity_decode('&#36830;&#34915;&#35033;', ENT_COMPAT, 'UTF-8');
// 连衣裙

Question 2

You need to use:

utf8_encode($data);

and not utf8_decode(), to convert your current ISO-8859-1 data to UTF-8.
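Note that utf8_encode() is deprecated as of PHP 8.2. As a rough sketch of an alternative, assuming the mbstring extension is available, mb_convert_encoding() does the same conversion with the source encoding named explicitly. The variable names and the sample string are only illustrative:

```php
<?php
// Assumption: $latin1 holds ISO-8859-1 bytes (here "café", with é as the single byte 0xE9).
$latin1 = "caf\xE9";

// Arguments: the string, the target encoding, the source encoding.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// In UTF-8 the é becomes the two bytes C3 A9.
echo bin2hex($utf8), "\n"; // 636166c3a9
```

Naming the source encoding makes it harder to convert the same data twice by accident, which is a common cause of garbled output.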

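Some native PHP functions such as strtolower(), strtoupper() and ucfirst() work byte by byte and do not always function correctly with UTF-8 strings. Possible solutions: convert to Latin-1 first (which only works if every character exists in Latin-1), or add the following line to your code so that these functions only touch ASCII letters: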

setlocale(LC_CTYPE, 'C');
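Another option, if the mbstring extension is installed, is to use the multibyte-aware counterparts. The following is a minimal sketch; ucfirst_utf8() is a hypothetical helper written for this example, not a built-in function:

```php
<?php
// "\u{E9}" is é; the escape keeps the example independent of how this file is saved.
$text = "\u{E9}lan vital";

// Multibyte-aware case conversion, with the encoding stated explicitly.
$upper = mb_strtoupper($text, 'UTF-8'); // ÉLAN VITAL
$lower = mb_strtolower($upper, 'UTF-8'); // élan vital

// Hypothetical helper: upper-case only the first character of a UTF-8 string.
function ucfirst_utf8(string $s): string
{
    $first = mb_substr($s, 0, 1, 'UTF-8');
    $rest  = mb_substr($s, 1, null, 'UTF-8');
    return mb_strtoupper($first, 'UTF-8') . $rest;
}

echo $upper, "\n";
echo $lower, "\n";
echo ucfirst_utf8($text), "\n"; // Élan vital
```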

Make sure not to save your PHP files with a UTF-8 BOM (byte order mark); otherwise the BOM bytes are sent as output, and your browser might show them as stray characters between PHP pages on your site.
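If you receive text that may already carry a BOM, a small check along these lines can strip it. This is only a sketch; strip_utf8_bom() is a hypothetical helper name, and it assumes the input is a UTF-8 byte string:

```php
<?php
// Hypothetical helper: remove a leading UTF-8 BOM (bytes EF BB BF) if present.
function strip_utf8_bom(string $s): string
{
    // Compare only the first three bytes against the BOM sequence.
    return strncmp($s, "\xEF\xBB\xBF", 3) === 0 ? substr($s, 3) : $s;
}

$withBom = "\xEF\xBB\xBFhello";
echo strlen($withBom), "\n";                 // 8 (3 BOM bytes + 5 letters)
echo strlen(strip_utf8_bom($withBom)), "\n"; // 5
```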

Just for your reference:

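ISO-8859-1 => covers Western European languages such as Albanian, Catalan, Danish, Dutch, English, Finnish, French, German, Norwegian, Portuguese (including Brazilian Portuguese), Spanish, Swedish

UTF-8 => can encode every Unicode character, so it covers all of the above as well as Chinese (simplified), Chinese (traditional), Japanese, Persian, and so on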

Question 3

There are many tools that can convert character references to characters, and writing such a tool is rather straightforward, especially if you know the references are all decimal. So the answer really depends on the software environment.

For example, to do such a conversion for an individual HTML document, you could use the BabelPad editor: command Convert → Numeric Character References (NCR) → NCR to Unicode, and save the result as UTF-8.
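If you would rather script the conversion, a decoder for decimal references takes only a few lines. The sketch below assumes PHP 7.2 or later with mbstring (for mb_chr()) and handles decimal references only; decode_decimal_ncr() is a hypothetical name chosen for this example:

```php
<?php
// Hypothetical helper: replace decimal references such as &#36830; with UTF-8 characters.
function decode_decimal_ncr(string $html): string
{
    $result = preg_replace_callback(
        '/&#([0-9]+);/',
        function (array $m): string {
            $cp = (int) $m[1];
            // mb_chr() returns false for invalid code points; keep the original text then.
            $ch = ($cp <= 0x10FFFF) ? mb_chr($cp, 'UTF-8') : false;
            return $ch === false ? $m[0] : $ch;
        },
        $html
    );
    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

echo decode_decimal_ncr('&#36830;&#34915;&#35033;'), "\n"; // 连衣裙
```

Named entities such as &amp;amp; and hexadecimal references are left untouched by this sketch; html_entity_decode() handles those as well.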