Use numpy.packbits to turn the boolean array into a uint8 array for writing, then numpy.unpackbits after reading it back. numpy.packbits pads the axis you're packing along with zeros to reach a multiple of 8 bits, so make sure you keep track of how many zeros you'll need to chop off the end when you unpack the array.
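As a rough illustration of the packbits/unpackbits round trip, here is a minimal sketch. The helper names save_packed and load_packed and the 8-byte length header are assumptions made for this example, not part of NumPy or of the original answer. Storing the element count in the file is just one way to remember how much padding to trim.

```python
import os
import tempfile

import numpy as np


def save_packed(path, flags):
    """Write a 1-D boolean array to `path` at one bit per element."""
    flags = np.asarray(flags, dtype=bool)
    # packbits returns a uint8 array, zero-padded to a multiple of 8 bits.
    packed = np.packbits(flags)
    with open(path, "wb") as f:
        # Hypothetical header: the original element count as a little-endian
        # uint64, so the padding can be trimmed after reading.
        np.array([flags.size], dtype="<u8").tofile(f)
        packed.tofile(f)


def load_packed(path):
    """Read back an array written by save_packed."""
    with open(path, "rb") as f:
        count = int(np.fromfile(f, dtype="<u8", count=1)[0])
        packed = np.fromfile(f, dtype=np.uint8)
    # unpackbits yields one uint8 (0 or 1) per bit; drop the padding bits
    # and convert back to bool.
    return np.unpackbits(packed)[:count].astype(bool)


# Small demonstration: 21 flags need 3 packed bytes plus the 8-byte header.
sieve = np.zeros(21, dtype=bool)
sieve[[2, 3, 5, 7, 11, 13, 17, 19]] = True

path = os.path.join(tempfile.mkdtemp(), "sieve.bin")
save_packed(path, sieve)
restored = load_packed(path)
```

Because the sieve itself stays an ordinary bool ndarray, the computation runs at its usual speed and the packing cost is only paid when writing or reading a file. Note that unpackbits produces a full one-byte-per-element array again, so memory use during loading is the same as for the unpacked array.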
Writing numpy.bool array to compact file?
Question
I'm using numpy and Python 2.7 to compute large (100 million+ elements) boolean arrays for a super-massive prime sieve and write them to binary files to read at a much later time. NumPy bools are 8-bit, so the file size that I'm writing is much larger than necessary. Since I'm writing a large number of these files I'd like to keep them as small as humanly possible without having to waste a lot of time/memory converting them to a bitarray and back.
I was originally going to switch to using the bitarray module to keep file size down, but the sieve computation time increased by around 400% with the same algorithms, which is a bit unacceptable. Is there a fast-ish way to write and read back the ndarray in a smaller file, or is this a trade-off that I'm just going to have to deal with?
Solution