C++ File IO: Reading and Writing 16-bit Words
Question
I want to write non-Unicode, 16-bit words to a file and read them back later. I know that with a bit of byte manipulation I can do this in char mode using fstream::read() and fstream::write(). What do I need to do to use 16-bit words directly?
For example, it seems I should be able to do the following:
basic_ofstream<uint16_t> aw;
aw.open("test.bin", ios::binary);
uint16_t c[] = {0x55aa, 0x1188};
aw.write(c, 2);
aw.close();
basic_ifstream<uint16_t> ax;
ax.open("test.bin", ios::binary);
uint16_t ui[2];
ax.read(ui, 2);
ax.close();
cout << endl << hex << unsigned(ui[0]) << " " << unsigned(ui[1]) << endl;
gcc 4.4 output:
d 0
VC++10 output:
CCCC CCCC
I've also tried using std::basic_filebuf<uint16_t> directly and got the same results. Why?
Solution
I'm actually surprised you got the streams instantiated to do any reading at all! The result is possibly implementation-defined (i.e., you might find the behavior described in the compiler's documentation), but more likely it is simply unspecified (although not quite undefined). The stream classes are not required to support instantiations for types other than char and wchar_t out of the box, i.e., without the user providing at least some of the facets.
The standard stream classes are templates on the character type, but they aren't easy to instantiate for an unsupported type. At a bare minimum, you'd need to implement a suitable std::codecvt<uint16_t, char, std::mbstate_t> facet converting between the external representation in bytes and the internal representation. From the looks of it, the two systems you tried have made different choices in their default implementations.
std::codecvt<internT, externT, stateT> is the facet used to convert between an external representation of characters and an internal representation of characters. Streams are only required to support char, which is considered to represent bytes, as the external type externT. The internal character type internT can be any integral type, but the conversion needs to be defined by implementing the code conversion facet. If I recall correctly, the streams can also assume that the state type stateT is std::mbstate_t (which is actually somewhat problematic because there is no interface defined for this type!).
Unless you are really dedicated to creating an I/O stream for your character type uint16_t, you probably want to read bytes using std::ifstream and convert them to your character type, and similarly for writing. To really create an I/O stream that also supports formatting, you'd need a number of other facets, too (e.g., std::ctype<uint16_t> and std::numpunct<uint16_t>), and you'd need to build a std::locale to contain all of these plus a few that can be instantiated from the standard library's implementation (e.g., std::num_get<uint16_t> and std::num_put<uint16_t>; I think their iterator types are suitably defaulted).
Other tips
When I try your code, the file is written but nothing is inside it; its size is 0 after closing. When reading from that file, nothing can be read, and what you see in the output is uninitialized garbage.
Besides using ofstream/ifstream with the default char type, you should not blindly rely on the read() and write() methods, because on their own they do not indicate whether they actually transferred anything. Refer to http://en.cppreference.com/w/cpp/io/basic_ostream/write for details. This part is especially interesting:
This function is an unformatted output function: it begins execution by constructing an object of type sentry, which flushes the tie()'d output buffers if necessary and checks the stream errors. After construction, if the sentry object returns false, the function returns without attempting any output.
This is likely why no output was written to your file: the stream does not seem to be designed to work with any types other than char or similar.
Update: To see whether writing/reading succeeded, check the fail and bad bits, which would have already indicated that something went wrong:
cout << aw.fail() << aw.bad() << "\n";
cout << ax.fail() << ax.bad() << "\n";
Both were set to true, so your real question should have been: why did the call to write() fail?
I suggest reading: http://www.cplusplus.com/articles/DzywvCM9/
Snippets:
"The problem with these types is that their size is not well defined. int might be 8 bytes on one machine, but only 4 bytes on another. The only one that's consistent is char... which is guaranteed to always be 1 byte."
// assumes fixed-width typedefs, e.g.:
//   typedef std::uint16_t u16;  typedef std::uint8_t u8;
u16 ReadU16(istream& file)
{
    u8 bytes[2];
    file.read( (char*)bytes, 2 );                       // read 2 bytes from the file
    u16 val = (u16)(bytes[0] | (bytes[1] << 8));        // construct the 16-bit value from those bytes
    return val;
}
void WriteU16(ostream& file, u16 val)
{
    u8 bytes[2];
    // extract the individual bytes from our value
    bytes[0] = (val) & 0xFF;       // low byte
    bytes[1] = (val >> 8) & 0xFF;  // high byte
    // write those bytes to the file
    file.write( (char*)bytes, 2 );
}
You may want to refresh on the "typedef" keyword as well, for defining the guaranteed-#-bits types. While a little more of a learning curve, Boost and C99 compilers define guaranteed-size types as well. I'm not sure about C++0x, but it's too new to be portable.
You can use char specializations and reinterpret_cast:
basic_ofstream<char> aw;
...
aw.write( reinterpret_cast<const char*>(i16buf), n2write*sizeof(int16_t) );
basic_ifstream<char> ax;
...
ax.read( reinterpret_cast<char*>(i16buf), n2read*sizeof(int16_t) );
The "sizeof(int16_t)" is for the edge cases where sizeof(int16_t)==1 (e.g. on DSP processors, where char itself is 16 bits wide).
Of course, if you need to read/write in a specific byte order, then you need endian conversion functions. Note, there is no standard compile-time way of determining endianness.