How to assign only 16 bits to any integer in a binary file instead of the normal 32 in C++?

StackOverflow https://stackoverflow.com/questions/19638125

01-07-2022

Question

I have a program that creates a compressed file using the LZW algorithm with hash tables. My compressed file currently contains integers corresponding to hash-table indices. The maximum integer in this compressed file is around 46000, which can easily be represented in 16 bits. Now when I convert this "compressedfile.txt" to a binary file "binary.bin" (to further reduce the file size) using the following code, I get 32-bit integers in my "binary.bin" file. E.g. if there is the number 84 in my compressed file, it shows up as the bytes 5400 0000 in my binary file.

#include <fstream>

std::ifstream in("compressedfile.txt");
std::ofstream out("binary.bin", std::ios::out | std::ios::binary);

int d;
while (in >> d)
    out.write((char*)&d, 4); // writes all 4 bytes of the int

My question is: can't I discard the trailing '0000' in '5400 0000', which wastes an extra 2 bytes per value? This is the case with every integer, since my maximum is 46000, which fits in 2 bytes. Is there any code that can write my binary file that way? I hope my question is clear.


Solution

It's writing exactly what you told it to: 4 bytes starting at the address of d (an int, which is 32 bits on most common platforms). Use a 16-bit type and write 2 bytes instead:

uint16_t d; // unsigned 16-bit: holds 0..65535, enough for your max value of 46000
while (in >> d)
    out.write(reinterpret_cast<char*>(&d), sizeof d);

Edit: As pointed out in the comments, for this code and the data it generates to be portable across processor architectures you should pick an endianness convention for the output. I'd suggest using htons() to convert your uint16_t to network byte order which is widely available, though not (yet) part of the C++ standard.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow