Question

If a file name contains a £ (pound) sign, directory_iterator correctly returns the UTF-8 byte sequence \xC2\xA3.

wdirectory_iterator uses wide characters, but still returns the UTF-8 sequence. Is this the correct behaviour for wdirectory_iterator, or am I using it incorrectly?

AddFile(testpath, "pound£sign"); 
wdirectory_iterator iter(testpath);
TS_ASSERT_EQUALS(iter->leaf(),L"pound\xC2\xA3sign"); // Succeeds
TS_ASSERT_EQUALS(*iter, L"pound£sign"); // Fails

Solution

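The encoding of wide characters (wchar_t objects) is implementation-defined. For the second assertion (the one comparing against L"pound£sign") to pass, you will probably need to change the locale in effect. The default is "C", which does not know about the pound character, so the two UTF-8 bytes are presumably passed through as two separate wide characters rather than decoded into one. The assertion written with hex escapes succeeds because it compares those raw values directly and does not require mapping the glyph to a value in a particular encoding.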

Note: for brevity, I am skipping the exact wording of the standard with respect to wchar_t, extended character sets, etc.
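
To make the difference between the two wide strings concrete, here is a minimal sketch in standard C++ that does not depend on Boost or on any installed locale. The helper names widen_bytewise and decode_utf8 are hypothetical and not part of any library. widen_bytewise models what a conversion under the "C" locale is assumed to do (one wide character per byte), while decode_utf8 is a deliberately small UTF-8 decoder that only handles sequences of up to three bytes.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: widen each byte to one wchar_t with no decoding.
// This is an assumption about what a "C"-locale conversion amounts to.
std::wstring widen_bytewise(const std::string& bytes) {
    std::wstring out;
    for (char c : bytes) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        out.push_back(static_cast<wchar_t>(static_cast<unsigned char>(c)));
    }
    return out;
}

// Hypothetical helper: minimal UTF-8 decoder for 1- to 3-byte sequences
// (U+0000..U+FFFF). It does not reject overlong forms or surrogates.
std::wstring decode_utf8(const std::string& bytes) {
    std::wstring out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        unsigned long cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80) {
            cp = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            cp = lead & 0x1F;
            extra = 1;
        } else if ((lead & 0xF0) == 0xE0) {
            cp = lead & 0x0F;
            extra = 2;
        } else {
            throw std::runtime_error("unsupported or invalid lead byte");
        }
        if (i + extra >= bytes.size()) {
            throw std::runtime_error("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cont = static_cast<unsigned char>(bytes[i + k]);
            if ((cont & 0xC0) != 0x80) {
                throw std::runtime_error("invalid continuation byte");
            }
            cp = (cp << 6) | (cont & 0x3F);
        }
        out.push_back(static_cast<wchar_t>(cp));
        i += extra + 1;
    }
    return out;
}
```

With these helpers, widen_bytewise("pound\xC2\xA3sign") compares equal to L"pound\xC2\xA3sign" (11 wide characters), whereas decode_utf8("pound\xC2\xA3sign") compares equal to L"pound\u00A3sign" (10 wide characters). The second form is what the failing assertion expects, provided the compiler reads the £ in the source file as U+00A3.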

Licensed under: CC-BY-SA with attribution