The standard's description of the partial return code of codecvt::do_out says exactly this:
In Table 83:
partial: not all source characters converted
In 22.4.1.4.2 [locale.codecvt.virtuals]/5:
Returns: An enumeration value, as summarized in Table 83. A return value of partial, if (from_next==from_end), indicates that either the destination sequence has not absorbed all the available destination elements, or that additional source elements are needed before another destination element can be produced.
In your case, not all (in fact, zero) source characters were converted, which technically says nothing about the contents of the output sequence (the 'if' condition in that sentence does not hold). Speaking generally, though, "the destination sequence has not absorbed all the available destination elements" refers to valid multibyte characters: they are the elements of the multibyte character sequence produced by codecvt_utf8.
It would be nice to have a more explicit standard wording, but here are two circumstantial pieces of evidence:
One: the old C wide-to-multibyte conversion function std::wcsrtombs (whose locale-specific variants are usually called by existing implementations of codecvt::do_out for system-supplied locales) is defined as follows:
Conversion stops [...] when the next multibyte character would exceed the limit of len total bytes to be stored into the array pointed to by dst.
And two, look at the existing implementations of codecvt_utf8: you've already explored Microsoft's, and here's what's in libc++. codecvt_utf8::do_out there calls ucs2_to_utf8 on Windows and ucs4_to_utf8 on other systems, and ucs2_to_utf8 does the following (comments mine):
    else if (wc < 0x0800)
    {
        // not relevant
    }
    else // if (wc <= 0xFFFF)
    {
        if (to_end-to_nxt < 3)
            return codecvt_base::partial; // <- look here
        *to_nxt++ = static_cast<uint8_t>(0xE0 | (wc >> 12));
        *to_nxt++ = static_cast<uint8_t>(0x80 | ((wc & 0x0FC0) >> 6));
        *to_nxt++ = static_cast<uint8_t>(0x80 | (wc & 0x003F));
    }
That is, nothing is written to the output sequence if it cannot fit the whole multibyte character that results from consuming one input wide character.
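To make the all-or-nothing behaviour concrete, here is a minimal self-contained sketch modelled on the libc++ branch quoted above. It is not the library's code: encode_utf8_bmp and the Result enum are hypothetical names introduced only for illustration, and the function handles a single BMP code point rather than a whole range.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type mirroring std::codecvt_base::result.
enum class Result { ok, partial, error };

// Hypothetical helper: encodes one BMP code point (<= 0xFFFF) as UTF-8.
// It writes either the complete multibyte character or nothing at all.
Result encode_utf8_bmp(std::uint16_t wc, std::uint8_t*& to_nxt, std::uint8_t* to_end)
{
    if (wc >= 0xD800 && wc <= 0xDFFF)
        return Result::error;  // lone surrogate halves are not encodable

    // Number of bytes the encoded character needs.
    const std::ptrdiff_t need = wc < 0x80 ? 1 : wc < 0x0800 ? 2 : 3;

    // The room check happens before any byte is written.
    if (to_end - to_nxt < need)
        return Result::partial;

    if (need == 1) {
        *to_nxt++ = static_cast<std::uint8_t>(wc);
    } else if (need == 2) {
        *to_nxt++ = static_cast<std::uint8_t>(0xC0 | (wc >> 6));
        *to_nxt++ = static_cast<std::uint8_t>(0x80 | (wc & 0x3F));
    } else {
        *to_nxt++ = static_cast<std::uint8_t>(0xE0 | (wc >> 12));
        *to_nxt++ = static_cast<std::uint8_t>(0x80 | ((wc & 0x0FC0) >> 6));
        *to_nxt++ = static_cast<std::uint8_t>(0x80 | (wc & 0x003F));
    }
    return Result::ok;
}
```

Calling this with U+20AC and only two bytes of room returns Result::partial and leaves both the output pointer and the buffer untouched; with three bytes of room it writes the full sequence and advances the pointer by three.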