If you print unicode("\u0026abc"), you will see the root of your problem: in a plain (byte) string literal, \u0026 is not an escape sequence, so the backslash is kept literally:
>>> a = u"abc"
>>> ua = unicode("abc")
>>> a == ua
True
>>> b = u"\u0026abc"
>>> b
u'&abc'
>>> ub = unicode("\u0026abc")
>>> ub
u'\\u0026abc'
You can fix it this way:
>>> ub = unicode("&abc")
>>> ub
u'&abc'
>>> b == ub
True
But that required a human to change the code. To do it programmatically, you might try:
>>> c = "\u0026abc"
>>> c
'\\u0026abc'
>>> cc = "u\'" + c + "\'"
>>> cc
"u'\\u0026abc'"
>>> eval(cc)
u'&abc'
However, this solution is not very general; Daniel's answer provides a better one.
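For reference, one way to do the same conversion without eval is the standard unicode_escape codec. This is only a minimal sketch and not necessarily what Daniel's answer proposes; the names raw and decoded are illustrative, and the input is written as a bytes literal so that the snippet behaves the same on Python 2 and Python 3.3+.

```python
# The byte string holds a literal backslash followed by "u0026abc",
# which is what "\u0026abc" produces in a Python 2 str literal.
raw = b"\\u0026abc"

# The unicode_escape codec interprets \uXXXX sequences while decoding.
decoded = raw.decode("unicode_escape")

print(decoded == u"&abc")
```

Note that unicode_escape treats any non-ASCII bytes in the input as Latin-1, so it is best suited to input that is plain ASCII apart from the escape sequences.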