Remove the repr() from your code. Use repr() only to create debug output; it turns a unicode value into a string representation that can be pasted back into the interpreter.
This means your line from the file is now stored as:
>>> repr(u'خُداوند خُداوند خُداوند\n').split(" ")
["u'\\u062e\\u064f\\u062f\\u0627\\u0648\\u0646\\u062f", '\\u062e\\u064f\\u062f\\u0627\\u0648\\u0646\\u062f', "\\u062e\\u064f\\u062f\\u0627\\u0648\\u0646\\u062f\\n'"]
Note the double backslashes (escaped unicode escapes); also, the first string starts with u' and the last string ends with \\n'. These three values can therefore never be equal.
Remove the repr(), and use .split() without arguments to remove the trailing whitespace too:
lst = file_obj.readline().split()
and your code will work:
>>> res = u'خُداوند خُداوند خُداوند\n'.split()
>>> res[0] == res[1] == res[2]
True
You may need to normalize the input first; some characters can be expressed either as one unicode codepoint or as two combining codepoints. Normalizing moves all such characters to a composed or decomposed state. See Normalizing Unicode.
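As a rough sketch of that normalization step, the standard library's unicodedata.normalize() can be applied to each word after splitting. The helper name split_normalized and the sample strings are made up for illustration, and NFC (composed form) is only one reasonable choice; NFD would work just as well as long as both sides of the comparison use the same form.

```python
import unicodedata


def split_normalized(line, form='NFC'):
    # Hypothetical helper: split on any whitespace (dropping the trailing
    # newline), then normalize every word to the same Unicode form.
    return [unicodedata.normalize(form, word) for word in line.split()]


# The same visible text written two ways (illustrative sample data):
composed = u'\u0622\u0628'          # ALEF WITH MADDA ABOVE, then BEH
decomposed = u'\u0627\u0653\u0628'  # ALEF + combining MADDAH ABOVE, then BEH

print(composed == decomposed)                                      # False
print(split_normalized(composed) == split_normalized(decomposed))  # True
```

Without normalization the two spellings compare unequal even though they render identically; after normalizing both to the same form they match.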