Most efficient way of loading formatted binary files in Python

https://stackoverflow.com/questions/703262

python
binaryfiles
input

22-08-2019

Question

I have binary files, no larger than 20 MB each, that have a header section followed by a data section containing sequences of uchars. I have NumPy, SciPy, etc. available, and each library has its own way of loading the data. Which method would be the most efficient?

Solution

The struct module should work for the header section, while NumPy's memmap would be efficient for the data section if you are going to manipulate it in NumPy anyway. There's no need to worry about being inconsistent here: the two methods are compatible, so just use the right tool for each job.
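A minimal sketch of that combination follows. The header layout (a 4-byte magic string, a 16-bit version, and a 32-bit payload length, all little-endian) and the function name load are assumptions made up for illustration, since the question does not describe the real format.

```python
import struct

import numpy as np

# Hypothetical header layout (an assumption, not the asker's real format):
# 4-byte magic, uint16 version, uint32 payload length, little-endian, no padding.
HEADER = struct.Struct("<4sHI")


def load(path):
    """Return (magic, version, data); data is a read-only uint8 memmap."""
    # Parse the small fixed-size header with struct.
    with open(path, "rb") as f:
        magic, version, count = HEADER.unpack(f.read(HEADER.size))

    # Map the data section lazily; bytes are read from disk only when accessed.
    # Note: np.memmap cannot map an empty region, so count must be > 0.
    data = np.memmap(path, dtype=np.uint8, mode="r",
                     offset=HEADER.size, shape=(count,))
    return magic, version, data
```

The returned memmap behaves like an ordinary uint8 array, so slicing and reductions such as data[:16] or data.mean() work directly. For files of this size, np.fromfile (which accepts an offset argument in recent NumPy versions) would read everything into memory instead, and may be just as practical.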

OTHER TIPS

Use the struct module, or possibly a custom module written in C if performance is critical.

bdec seems promising.

I found that array.fromfile is the fastest method for homogeneous data.
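As a rough illustration of this approach using only the standard library, the sketch below skips the header and reads the rest of the file as unsigned chars. The function name load_uchars and the header_size parameter are hypothetical; it also assumes the data section runs to the end of the file.

```python
import os
from array import array


def load_uchars(path, header_size):
    """Read everything after the header as an array of unsigned chars."""
    # Assumption: the data section extends to the end of the file.
    count = os.path.getsize(path) - header_size

    data = array("B")  # typecode "B" is unsigned char, one byte per item
    with open(path, "rb") as f:
        f.seek(header_size)        # skip the header section
        data.fromfile(f, count)    # raises EOFError if fewer items are available
    return data
```

The result supports indexing and slicing like a list, and it exposes the buffer protocol, so it can be passed to NumPy later without copying if needed.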

Licensed under: CC-BY-SA with attribution
