Getting wrong zero values with numpy fromfile when reading binary files

https://stackoverflow.com//questions/25004225

20-12-2019

Question

I am trying to read a binary file with Python. This is the code I use:

fb = open(Bin_File, "r")
a = numpy.fromfile(fb, dtype=numpy.float32)

However, I get zero values at the end of the array. For example, with nrows=296 and ncols=439, so that len(a)=296*439, I get zeros for a[-922:]. I know from a trusted piece of R code that these values should be noData (-9999 in this example). Does anybody know why I am getting these nonsensical zeros?

P.S.: I am not sure whether it is related or not, but len(a) is actually nrows*ncols+2. I have to get rid of these two extra values using a = a[0:-2] so that I don't get an error when I reshape the array into rows and columns using a_reshape = a.reshape(nrows, ncols).

Solution

When opening a file for reading as binary you should use the mode "rb" instead of "r".

For background, see the Python documentation on file modes. On Linux machines you don't need the "b", but it won't hurt. On Windows machines you must use "rb" for binary files.
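As a minimal sketch of the fix, the snippet below writes a few float32 values to a temporary file and reads them back with the file opened in "rb" mode. The helper name read_float32 and the file name demo.bin are made up for illustration; only open and numpy.fromfile come from the question.

```python
import os
import tempfile

import numpy as np


def read_float32(path):
    """Read a whole file as float32 values, opening it in binary mode."""
    # "rb" asks for the raw bytes, with no text-mode translation.
    with open(path, "rb") as fb:
        return np.fromfile(fb, dtype=np.float32)


# Round-trip a small array through a temporary file (hypothetical data).
expected = np.array([1.5, -9999.0, 0.0, 26.0], dtype=np.float32)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")
    expected.tofile(path)
    values = read_float32(path)
```

Passing the path straight to numpy.fromfile(path, dtype=numpy.float32) is another option, since NumPy then opens the file itself.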

Also note that the two extra entries you're getting are a common side effect of Fortran's "unformatted" binary output format. Each write statement issued in this mode produces a record that is surrounded by two 4-byte blocks.

These blocks represent integers that list the number of bytes in the block of unformatted data. For example, [223] [223 bytes of data] [223].
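The record layout described above can be sketched as follows. This is an illustration under stated assumptions, not a general Fortran reader: it assumes a single record, 4-byte little-endian markers, and float32 data, and the function name read_fortran_record is hypothetical. Marker size and byte order depend on the compiler and platform that wrote the file.

```python
import io
import struct

import numpy as np


def read_fortran_record(stream, dtype="<f4"):
    """Read one record framed as [nbytes][payload][nbytes].

    Assumes 4-byte little-endian markers (an assumption, see above).
    """
    head = stream.read(4)
    if len(head) != 4:
        raise EOFError("no record marker found")
    (nbytes,) = struct.unpack("<i", head)

    payload = stream.read(nbytes)
    if len(payload) != nbytes:
        raise ValueError("record is truncated")

    tail = stream.read(4)
    if len(tail) != 4 or struct.unpack("<i", tail)[0] != nbytes:
        raise ValueError("leading and trailing markers disagree")

    # Copy so the result is a writable array independent of the buffer.
    return np.frombuffer(payload, dtype=dtype).copy()


# Build a synthetic record in memory instead of reading a real file.
data = np.array([1.0, 2.0, -9999.0], dtype="<f4")
marker = struct.pack("<i", data.nbytes)
raw = marker + data.tobytes() + marker

record = read_fortran_record(io.BytesIO(raw))

# Reading the same bytes naively as float32 yields two extra entries.
naive = np.frombuffer(raw, dtype="<f4")
```

With this layout the two extra values sit at the first and last positions of the naively read array, so if the file in the question really is a single such record, a[1:-1] would recover the data, whereas a[0:-2] would keep the leading marker and drop one real value.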

Licensed under: CC-BY-SA with attribution