Domanda

I have a problem when I try to parse a xml file containing a specific Kanji:

退

After debugging, I see that the problem is in this function of RapidXml :

struct text_pure_no_ws_pred
{
    static unsigned char test(Ch ch)
    {
        return internal::lookup_tables<0>::lookup_text_pure_no_ws[static_cast<unsigned char>(ch)];
    }
};


const unsigned char lookup_tables<Dummy>::lookup_text_pure_no_ws[256] = 
    {
      // 0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
         0,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  // 0
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  // 1
         1,  1,  1,  1,  1,  1,  0,  1,  1,  1,  1,  1,  1,  1,  1,  1,  // 2
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  0,  1,  1,  1,  // 3
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  // 4
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  // 5
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  // 6
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  // 7
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  // 8
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  // 9
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  // A
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  // B
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  // C
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  // D
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  // E
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1   // F
    };

where ch is the kanji 退. This function returns false. Why? With all the others characters, it returns true. Do you have any idea?

È stato utile?

Soluzione

It looks like Ch contains a Unicode value. static_cast<unsigned char>(0x9000) is 0.

You need a table that holds a lot more than 256 values.

Altri suggerimenti

RapidXML does not support full Unicode only UTF-8.

http://rapidxml.sourceforge.net/manual.html#namespacerapidxml_1character_types_and_encodings

See: Rapidxml and UTF8

The only options you have are: Convert the Kanji to UTF-8 and hope it works. Convert to non Unicode code-page and hope that that works with RapidXML.

Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top