The right way to check if a string has Hebrew chars
Question
Hebrew characters have Unicode code points between 1424 and 1514 (hex 0590 to 05EA).
I'm looking for the right, most efficient and most Pythonic way to check whether a string contains any of them.
First I came up with this:
for c in s:
    if ord(c) >= 1424 and ord(c) <= 1514:
        return True
return False
Then I came up with a more elegant implementation:
return any(map(lambda c: (ord(c) >= 1424 and ord(c) <= 1514), s))
And maybe:
return any([(ord(c) >= 1424 and ord(c) <= 1514) for c in s])
Which of these is the best? Or should I do it differently?
Solution
You could do:
# Python 3.
return any("\u0590" <= c <= "\u05EA" for c in s)
# Python 2.
return any(u"\u0590" <= c <= u"\u05EA" for c in s)
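As a self-contained sketch, the Python 3 expression above can be wrapped in a function. The name contains_hebrew is a hypothetical choice for illustration, and the range is the one given in the question (U+0590 to U+05EA).

```python
def contains_hebrew(s):
    """Return True if any character of s lies in U+0590..U+05EA."""
    # Comparing single-character strings compares their code points,
    # so no ord() call is needed.
    return any("\u0590" <= c <= "\u05EA" for c in s)


# "\u05e9\u05dc\u05d5\u05dd" is the Hebrew word "shalom", written with escapes.
print(contains_hebrew("hello"))                           # False
print(contains_hebrew("hello \u05e9\u05dc\u05d5\u05dd"))  # True
```

Because any() consumes a generator here, it stops at the first Hebrew character instead of scanning the whole string.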
Other suggestions
Your basic options are:
- Match against a regex containing the range of characters; or
- Iterate over the string, testing for membership of the character in a string or set containing all of your target characters, and break if you find a match.
Only actual testing can show which is going to be faster.
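The two options above could be sketched and timed roughly as follows. This assumes Python 3; the function names, the sample string and the repetition count are arbitrary choices for illustration, and the character range is the one from the question.

```python
import re
import timeit

# Range taken from the question: U+0590 to U+05EA.
HEBREW_RE = re.compile("[\u0590-\u05EA]")
HEBREW_CHARS = frozenset(chr(cp) for cp in range(0x0590, 0x05EA + 1))


def has_hebrew_regex(s):
    # search() returns as soon as it finds the first matching character.
    return HEBREW_RE.search(s) is not None


def has_hebrew_set(s):
    # any() short-circuits, which plays the role of "break on first match".
    return any(c in HEBREW_CHARS for c in s)


if __name__ == "__main__":
    # Worst case for both: the only Hebrew letter is the last character.
    sample = "x" * 10_000 + "\u05d0"
    for fn in (has_hebrew_regex, has_hebrew_set):
        seconds = timeit.timeit(lambda: fn(sample), number=200)
        print(f"{fn.__name__}: {seconds:.4f}s")
```

The timings depend on the Python version and on where (or whether) a Hebrew character appears in typical input, so it is worth measuring with strings that resemble the real data.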
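It's simple to check the first character with unicodedata:
import unicodedata

def is_greek(term):
    # Only the first non-whitespace character is inspected.
    # The '' default avoids a ValueError for characters without a name.
    return 'GREEK' in unicodedata.name(term.strip()[0], '')

def is_hebrew(term):
    return 'HEBREW' in unicodedata.name(term.strip()[0], '')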
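The functions above look only at the first character, while the question asks about any character in the string. A minimal sketch of that extension, using a hypothetical helper named contains_script, might look like this:

```python
import unicodedata


def contains_script(term, script_name):
    """Return True if any character's Unicode name contains script_name."""
    # The second argument to unicodedata.name() is a default that is
    # returned instead of raising ValueError for unnamed characters.
    return any(script_name in unicodedata.name(ch, "") for ch in term)


print(contains_script("abc \u05d0", "HEBREW"))  # True: U+05D0 is HEBREW LETTER ALEF
print(contains_script("abc", "HEBREW"))         # False
```

Note that a name-based test matches every character whose Unicode name contains the word, so it can accept characters outside the 0590 to 05EA range given in the question, such as the Yiddish ligatures starting at U+05F0.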
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow