If the page is encoded in GB2312 and your script (the file itself) is encoded in UTF-8, the match cannot work. .find() will look for
the UTF-8 byte sequences of your search string and slide right past the characters you're looking for, because the page stores them as different bytes:
         开      奖      结      果
GB2312   bfaa    bdb1    bde1    b9fb
UTF-16   5f00    5956    7ed3    679c
UTF-8    e5bc80  e5a596  e7bb93  e69e9c
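
To see the mismatch in practice, here is a minimal sketch, assuming Python 3 and that you have the page as raw bytes. `page_bytes` is a made-up stand-in for the real response body, not something from your code. It shows the UTF-8 search missing, the GB2312 search hitting, and the usually safer route of decoding the page once and searching text:

```python
# -*- coding: utf-8 -*-
# Sketch only: page_bytes simulates what a GB2312-encoded page would contain.
needle = "开奖结果"

# Hypothetical page body, encoded the way the server would send it.
page_bytes = "<td>开奖结果</td>".encode("gb2312")

# The same text yields different bytes depending on the encoding.
utf8_needle = needle.encode("utf-8")   # e5bc80 e5a596 e7bb93 e69e9c
gb_needle = needle.encode("gb2312")    # bfaa bdb1 bde1 b9fb

# Searching GB2312 bytes for UTF-8 bytes cannot match.
miss = page_bytes.find(utf8_needle)

# Searching with the needle re-encoded as GB2312 does match.
hit = page_bytes.find(gb_needle)

# Usually simpler: decode the page to text once, then search with a str.
page_text = page_bytes.decode("gb2312")
text_hit = page_text.find(needle)

print(miss, hit, text_hit)
```

Either fix works; decoding the page up front tends to be easier, since every later comparison then happens on text rather than on encoding-specific bytes.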