Question

I am writing a compression program and need to write bit data to a binary file using C++. If anyone could advise on the write statement, or point me to a website with advice, I would be very grateful.

Apologies if this is a simple or confusing question; I am struggling to find answers on the web.

Solution

Collect the bits into whole bytes, such as an unsigned char or std::bitset (where the bitset size is a multiple of CHAR_BIT), then write whole bytes at a time. Computers "deal with bits", but the available abstraction – especially for IO – is that you, as a programmer, deal with individual bytes. Bitwise manipulation can be used to toggle specific bits, but you're always handling byte-sized objects.
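As a rough illustration of collecting bits into whole bytes, here is a minimal sketch. The names BitWriter, put_bit, flush and pack_bits are made up for this example and do not come from any library, and the code assumes 8-bit bytes with the first bit stored in the most significant position.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the answer): collects bits, most significant
// bit first, and writes one byte to the stream each time eight have arrived.
class BitWriter {
public:
    explicit BitWriter(std::ostream& out) : out_(out), byte_(0), count_(0) {}

    void put_bit(bool bit) {
        byte_ = static_cast<unsigned char>((byte_ << 1) | (bit ? 1 : 0));
        if (++count_ == 8) {
            emit();
        }
    }

    // Pad a trailing partial byte with zero bits and write it.
    void flush() {
        if (count_ > 0) {
            byte_ = static_cast<unsigned char>(byte_ << (8 - count_));
            emit();
        }
    }

private:
    void emit() {
        out_.put(static_cast<char>(byte_));
        byte_ = 0;
        count_ = 0;
    }

    std::ostream& out_;
    unsigned char byte_;  // bits collected so far
    int count_;           // how many bits of byte_ are in use
};

// Convenience wrapper for experiments: packs a string of '0'/'1' characters
// into an in-memory stream and returns the resulting bytes.
std::string pack_bits(const std::string& bits) {
    std::ostringstream out;
    BitWriter writer(out);
    for (char ch : bits) {
        assert(ch == '0' || ch == '1');
        writer.put_bit(ch == '1');
    }
    writer.flush();
    return out.str();
}
```

With this sketch, pack_bits("0100000101") yields two bytes, 0x41 followed by 0x40: the first eight bits fill one byte and the remaining two are padded with six zero bits.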

At the end of the output, if you don't have a whole byte, you'll need to decide how that should be stored (for example, pad the last byte with zero bits and record somewhere how many bits are valid). Both iostreams and stdio can write unformatted data, using ostream::write and fwrite, respectively.
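One possible way to handle the final partial byte is sketched below. The layout (a leading byte that holds the number of padding bits) is only an assumption chosen for this example, not a standard format, and write_packed, read_packed and round_trip are hypothetical names. The data bytes go out through ostream::write.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical layout (an assumption for this sketch):
//   byte 0      number of zero padding bits in the last data byte (0-7)
//   bytes 1..n  the bits, packed most significant bit first
void write_packed(std::ostream& out, const std::string& bits) {
    std::vector<unsigned char> data((bits.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < bits.size(); ++i) {
        assert(bits[i] == '0' || bits[i] == '1');
        if (bits[i] == '1') {
            data[i / 8] |= static_cast<unsigned char>(0x80u >> (i % 8));
        }
    }
    const char padding = static_cast<char>((8 - bits.size() % 8) % 8);
    out.put(padding);
    // ostream::write takes a const char*, hence the cast.
    out.write(reinterpret_cast<const char*>(data.data()),
              static_cast<std::streamsize>(data.size()));
}

std::string read_packed(std::istream& in) {
    const int padding = in.get();  // header byte written by write_packed
    std::string bits;
    char ch;
    while (in.get(ch)) {
        const unsigned char byte = static_cast<unsigned char>(ch);
        for (int shift = 7; shift >= 0; --shift) {
            bits += ((byte >> shift) & 1) ? '1' : '0';
        }
    }
    // Drop the padding bits that were appended to fill the last byte.
    if (padding > 0 && bits.size() >= static_cast<std::size_t>(padding)) {
        bits.erase(bits.size() - padding);
    }
    return bits;
}

// Round trip through an in-memory stream, for trying the format without a file.
std::string round_trip(const std::string& bits) {
    std::stringstream file;
    write_packed(file, bits);
    return read_packed(file);
}
```

Storing the padding count lets the reader recover exactly the bits that were written. With a real file, the streams should be opened with std::ios::binary.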

Instead of a single char or bitset<8> (8 being the most common value for CHAR_BIT), you might consider using a larger block size, such as an array of 4-32, or more, chars or the equivalent sized bitset.

OTHER TIPS

For writing binary, the trick I have found most helpful is to store all the binary data in a single array in memory and then write it to disk in one go. Writing a bit at a time, a byte at a time, or an unsigned long long at a time is generally slower than keeping all the data in an array and storing it with a single call to fwrite().

size_t fwrite ( const void * ptr, size_t size, size_t count, FILE * stream );

Ref: http://www.cplusplus.com/reference/clibrary/cstdio/fwrite/

In English:

fwrite( [pointer to the stored data], [size in bytes of ONE array element: 1 for unsigned char, typically 8 for unsigned long long], [number of elements in the array], [FILE*] )

Always check the return value to confirm that the write succeeded: fwrite returns the number of elements actually written.
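The call described above might be wrapped roughly as follows. This is a sketch only: write_all is a hypothetical helper name, and the file is opened in "wb" mode so that no newline translation takes place.

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// Hypothetical helper: writes the whole buffer with a single fwrite call and
// reports whether every byte was written and the file closed cleanly.
bool write_all(const char* path, const std::vector<unsigned char>& data) {
    assert(path != nullptr);
    std::FILE* file = std::fopen(path, "wb");  // "b": no newline translation
    if (file == nullptr) {
        return false;
    }
    // size = 1 byte per element, count = number of elements
    const std::size_t written = data.empty() ? 0 :
        std::fwrite(data.data(), sizeof(unsigned char), data.size(), file);
    // fclose flushes buffered data, so its result matters as well.
    const bool closed = (std::fclose(file) == 0);
    return closed && written == data.size();
}
```

A caller would fill a std::vector<unsigned char> with the packed bytes, call write_all once, and treat a false result as a failed write.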

Additionally, an argument can be made that using the largest possible object type is the fastest way to go ([unsigned long long] > [char]). While I am not versed in the code behind fwrite(), I suspect that converting from the natural object used in your code to [unsigned long long] and then writing it would take more time than letting fwrite() make do with what you have.

Back when I was learning Huffman coding, it took me a few hours to realize that there was a difference between [char] and [unsigned char]. Note that for this method you should always use unsigned variables to store the raw binary data; where plain char is signed, a byte with its top bit set becomes a negative value.
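A small sketch of why the unsigned type matters follows. The function names to_bits and table_index are invented for this example; the frequency-table use is an assumption about how a Huffman coder might index by byte value.

```cpp
#include <cassert>
#include <climits>
#include <string>

// Renders the bits of one byte, most significant first. Taking an unsigned
// char means the shifts below never operate on a negative value.
std::string to_bits(unsigned char byte) {
    std::string bits;
    for (int shift = CHAR_BIT - 1; shift >= 0; --shift) {
        bits += ((byte >> shift) & 1u) ? '1' : '0';
    }
    return bits;
}

// Index into a frequency table with UCHAR_MAX + 1 entries. Converting through
// unsigned char keeps the index non-negative even where plain char is signed
// and holds a byte such as 0xA0.
int table_index(char ch) {
    const int index = static_cast<unsigned char>(ch);
    assert(index >= 0 && index <= UCHAR_MAX);
    return index;
}
```

Using the plain char directly as an index could give a negative number on platforms where char is signed, which is the kind of bug that the unsigned type avoids.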

With the class below you can write and read bit by bit:

// Needs <fstream> and <string> plus `using namespace std;`, as in the example below.
class bitChar{
public:
    unsigned char c;    // byte currently being assembled
    int shift_count;    // number of bits already stored in c
    string BITS;        // bits waiting to be written, as '0'/'1' characters

    bitChar() : c(0), shift_count(0) {}

    // Read every byte of inf and return the bits as a string of '0'/'1'.
    string readByBits(ifstream& inf)
    {
        string s;
        char buffer;
        while (inf.get(buffer))
        {
            s += getBits(static_cast<unsigned char>(buffer));
        }
        return s;
    }

    void setBITS(const string& X)
    {
        BITS = X;
    }

    // Pack BITS into bytes (most significant bit first) and write them to
    // outf. A trailing partial byte is padded with zero bits. Closes outf
    // and returns the number of bits taken from BITS.
    int insertBits(ofstream& outf)
    {
        int total = 0;

        for (string::size_type i = 0; i < BITS.size(); ++i)
        {
            c = static_cast<unsigned char>((c << 1) | (BITS[i] == '1' ? 1 : 0));
            ++total;

            if (++shift_count == 8)
                writeBits(outf);
        }
        BITS.clear();

        if (shift_count > 0)
        {
            c = static_cast<unsigned char>(c << (8 - shift_count));
            writeBits(outf);
        }
        outf.close();
        return total;
    }

    // Return the eight bits of X, most significant bit first.
    string getBits(unsigned char X)
    {
        string s;
        for (int shift = 7; shift >= 0; --shift)
        {
            s += ((X >> shift) & 1) ? '1' : '0';
        }
        return s;
    }

    // Write the current byte unformatted and start a new one.
    void writeBits(ofstream& outf)
    {
        outf.put(static_cast<char>(c));
        c = 0;
        shift_count = 0;
    }
};

For example:

#include <iostream>
#include <sstream>
#include <fstream>
#include <string>
#include <stdlib.h>
using namespace std;

// (place the bitChar class from above here)

int main()
{
    string enCoded = "101000001010101010";

    // write to file (binary mode, so no newline translation occurs)
    ofstream outf("Sample.dat", ios::binary);
    cout << enCoded << endl; // prints 101000001010101010
    bitChar bchar;
    bchar.setBITS(enCoded);
    bchar.insertBits(outf); // also closes outf

    // read from file (opened only after the data has been written)
    ifstream inf("Sample.dat", ios::binary);
    string decoded = bchar.readByBits(inf);
    cout << decoded << endl; // prints 101000001010101010000000 (padded to 3 bytes)
    return 0;
}
Licensed under: CC-BY-SA with attribution