You have HTML entities; use the HTMLParser module to unescape them:
>>> import HTMLParser
>>> h = HTMLParser.HTMLParser()
>>> h.unescape("H&#7895; tr&#7907; ng&#244;n ng&#7919;")
u'H\u1ed7 tr\u1ee3 ng\xf4n ng\u1eef'
>>> print h.unescape("H&#7895; tr&#7907; ng&#244;n ng&#7919;")
Hỗ trợ ngôn ngữ
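The session above is Python 2. On Python 3 the module is no longer importable as HTMLParser; a minimal equivalent, assuming Python 3.4 or later, is html.unescape from the standard library. The variable names here are illustrative only.

```python
import html

# The text from the session above, written with decimal numeric
# character references.
escaped = "H&#7895; tr&#7907; ng&#244;n ng&#7919;"

# html.unescape() returns a str in which each reference is replaced
# by the Unicode character it names.
decoded = html.unescape(escaped)
print(decoded)
```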
These HTML entities use decimal numbers, not hexadecimal: 7895 is 1ed7 in hexadecimal, and so on. They encode Unicode code points directly; neither UTF-8 nor ISO-8859-1 is involved. ISO-8859-1, or Latin-1, cannot even encode most of these code points, since it only covers 0 through 255. (The text is Vietnamese for 'Language Support', according to Google Translate.)
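The decimal-to-hexadecimal relationship and the Latin-1 limitation can be checked directly. This is a small sketch in Python 3 syntax, with illustrative variable names, not part of the original session.

```python
# 7895 is the decimal number from the first entity in the example.
code_point = 7895

# The same number in hexadecimal, as it appears in the \u1ed7 escape.
hex_form = format(code_point, "x")
print(hex_form)

# chr() maps a code point number to its character.
char = chr(code_point)
print(char)

# Latin-1 only covers code points 0 through 255, so encoding fails here.
try:
    char.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
print(latin1_ok)

# UTF-8 can encode any code point; it is a byte encoding applied
# after the entity has been resolved to a character.
utf8_bytes = char.encode("utf-8")
print(utf8_bytes)
```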