I'm trying to process a German word list and can't figure out what encoding the file is in. The Unix `file` command reports it as "Non-ISO extended-ASCII text". Most of the words are plain ASCII, but here are the exceptions:

ANDR\x82
ATTACH\x82
C\x82ZANNE
CH\x83TEAU
CONF\x82RENCIER
FABERG\x82
L\x82VI-STRAUSS
RH\x93NETAL
P\xF2ANGE

Any hints would be great. Thanks!

EDIT: To be clear, the hex codes above are written as C hex string escapes, so each \xXX stands for a single byte with the hex value XX.


Solution

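It looks like CP437 or CP852, assuming each \x82 sequence stands for a single byte and is not literally four characters. In both code pages 0x82 is é, 0x83 is â, and 0x93 is ô, which gives ANDRé, CHâTEAU, RHôNETAL, and so on. At least, everything except the last line decodes sensibly that way; P\xF2ANGE is a bit of a puzzle.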
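One way to check a guess like this is to decode a few of the problem words with each candidate code page and see which output reads as real words. Below is a minimal Python sketch, assuming the words have been read as raw bytes; the `samples` list and the `decode_all` helper are illustrative names, not part of the original question.

```python
# Raw bytes for a few of the non-ASCII words from the question.
samples = [b"ANDR\x82", b"CH\x83TEAU", b"RH\x93NETAL", b"P\xf2ANGE"]


def decode_all(raw_words, encoding):
    """Decode each byte string with the given codec (hypothetical helper)."""
    # errors="replace" keeps going if a codec has no mapping for some byte.
    return [word.decode(encoding, errors="replace") for word in raw_words]


# Print the decoded words for each candidate code page for a visual comparison.
for enc in ("cp437", "cp852"):
    print(enc, decode_all(samples, enc))
```

For a whole file, the same idea applies: open it with `open(path, encoding="cp437")` and inspect the lines that contain non-ASCII characters.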

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow