What is the best method to read a double from a Binary file created in C?

https://stackoverflow.com/questions/631607

08-07-2019

Question

A C program writes consecutive doubles to a binary file, and I want to read them into Python. I tried using struct.unpack('d', f.read(8)).

EDIT: I used the following in C to write a random double number

r = drand48();
fwrite((void*)&r, sizeof(double), 1, data);

The errors are now fixed, but I cannot read the first value correctly: for a number that should be all zeros (0.000...), Python reads 3.90798504668055, while the rest are fine.

Solution

I think you are actually reading the number correctly but are being confused by how it is displayed. When I read the number from the file you provided, I get "3.907985046680551e-14", which is almost but not quite zero (0.000000000000039 in expanded form). I suspect your C code is just printing it with less precision than Python does.

[Edit] I've just tried reading the file in C using printf("%g", val), and I get the same result, though with slightly less precision (3.90799e-14). So if this value is incorrect, the error happened on the writing side rather than the reading side.
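The display difference can be reproduced from Python alone. This is only a sketch: it hard-codes the value quoted above instead of reading the original file, and the variable names are made up for illustration.

```python
import struct

# Value quoted in the answer above (hard-coded; the original file is not available here).
val = 3.907985046680551e-14

# Round-trip through the 8-byte native layout that fwrite/struct use for a double.
raw = struct.pack("d", val)
(decoded,) = struct.unpack("d", raw)

as_g = "%g" % decoded      # C-style %g: 6 significant digits
as_f = "%f" % decoded      # fixed notation: 6 decimal places
as_repr = repr(decoded)    # Python's shortest round-tripping form

print(as_g)     # 3.90799e-14
print(as_f)     # 0.000000
print(as_repr)  # 3.907985046680551e-14
```

The bytes decode to the same double each time; only the formatting decides whether it looks like zero.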

OTHER TIPS

Could you please elaborate on "didn't work"? Did the command crash? Did the data come out wrong? What actually happened?

If the command crashed:

Please share the error output of the command

If the data simply came out wrong:

Do the systems that create and read the data have the same endianness? If one is big-endian, and the other is little-endian, then you need to specify an endianness conversion in your format string.
If the endianness of the two computers is the same, how exactly was the data written to the file? If you know, what value was written to the file, and what incorrect value did you get out?
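One way to answer these questions is to look at the raw bytes and decode them under both byte orders. The sketch below is an illustration, not a diagnosis of the asker's file: decode_both_ways is a hypothetical helper, and the sample bytes are generated in place rather than read from the real data file.

```python
import struct
import sys

def decode_both_ways(raw8):
    """Interpret 8 raw bytes as a double in little- and big-endian order."""
    (little,) = struct.unpack("<d", raw8)
    (big,) = struct.unpack(">d", raw8)
    return little, big

# Stand-in for the first 8 bytes of the data file: 1.5 written little-endian.
sample = struct.pack("<d", 1.5)

little, big = decode_both_ways(sample)
print("host byte order:", sys.byteorder)
print("raw bytes:", sample.hex())
print("as little-endian:", little)
print("as big-endian:", big)
```

Whichever interpretation yields plausible values is most likely the byte order the writer used.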

First, have you tried pickle? No one has shown any Python code yet, so here is some code for reading binary data in Python:

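import array
import numpy as np  # NumPy stands in for the obsolete Numeric package

filename = "tmp.bin"
# num_lon and num_lat must already be defined; their product is the number of values to read.
count = num_lon * num_lat
binvalues = array.array('f')  # 'f' = 4-byte float; use 'd' for 8-byte doubles
with open(filename, mode='rb') as f:
    binvalues.fromfile(f, count)  # fromfile() replaces the deprecated array.read()
data = np.array(binvalues, dtype=float)  # convert to a NumPy array of floats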

Here 'f' specifies single-precision, 4-byte floating-point numbers. Find the size of each entry in your data and use the matching type code ('d' for 8-byte doubles).

For non-binary data you could do something simple like this:

tmp = []
for line in open("data.dat"):
    tmp.append(float(line))

f.read(8) might return fewer than 8 bytes, in which case struct.unpack raises an error.
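A minimal sketch of a reader that checks the length of every chunk before unpacking. The name read_doubles is hypothetical, and the "=d" format assumes the file was written with the same byte order as the reading machine.

```python
import io
import struct

DOUBLE_SIZE = struct.calcsize("=d")  # 8 bytes for a standard-size double

def read_doubles(f):
    """Yield doubles from a binary file object until a clean end of file."""
    while True:
        chunk = f.read(DOUBLE_SIZE)
        if not chunk:
            return  # clean end of file
        if len(chunk) < DOUBLE_SIZE:
            raise ValueError("truncated record: got %d bytes" % len(chunk))
        yield struct.unpack("=d", chunk)[0]

# Usage with an in-memory stand-in for open("data.bin", "rb"):
buf = io.BytesIO(struct.pack("=3d", 0.0, 3.1416, -1.05))
print(list(read_doubles(buf)))  # [0.0, 3.1416, -1.05]
```

A short chunk in the middle of the stream is reported instead of being silently passed to struct.unpack.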

Data might have different alignment and/or endianness:

>>> for c in '@=<>':
...     print repr(struct.pack(c+'d', -1.05))
...
'\xcd\xcc\xcc\xcc\xcc\xcc\xf0\xbf'
'\xcd\xcc\xcc\xcc\xcc\xcc\xf0\xbf'
'\xcd\xcc\xcc\xcc\xcc\xcc\xf0\xbf'
'\xbf\xf0\xcc\xcc\xcc\xcc\xcc\xcd'
>>> struct.unpack('<d', '\xbf\xf0\xcc\xcc\xcc\xcc\xcc\xcd')
(-6.0659880001157799e+066,)
>>> struct.unpack('>d', '\xbf\xf0\xcc\xcc\xcc\xcc\xcc\xcd')
(-1.05,)
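The session above is from Python 2. A rough Python 3 equivalent for decoding a whole buffer with an explicit byte order might look like the following; unpack_doubles is a hypothetical helper, and the input bytes are generated in place as a stand-in for file contents.

```python
import struct

def unpack_doubles(raw, byte_order="<"):
    """Decode a bytes object holding consecutive 8-byte doubles.

    byte_order is a struct prefix: "<" for little-endian, ">" for big-endian.
    """
    if len(raw) % 8:
        raise ValueError("length is not a multiple of 8 bytes")
    return [value for (value,) in struct.iter_unpack(byte_order + "d", raw)]

# Stand-in for file contents written on a big-endian machine.
raw = struct.pack(">3d", 0.0, 3.1416, -1.05)
print(unpack_doubles(raw, ">"))  # [0.0, 3.1416, -1.05]
```

Spelling out "<" or ">" also switches struct to standard sizes with no alignment padding, so the result does not depend on the machine doing the reading.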

The best method would be to use an ASCII text file:

0.0
3.1416
3.90798504668055

in that it would be portable and work with any kind of floating point implementation to a degree.

Reading raw binary data from a double's memory address is not portable at all, and is bound to fail in some different implementation.

You may of course use a binary format for compactness, but a portable C function writing in that format would not look like your snippet at all.

At the very least, the code should be surrounded by a series of ifs/ifdefs checking that the memory representation of doubles used by the current machine exactly matches the one expected by the Python interpreter.

Writing such code would be difficult, which is why I'm suggesting the easy, clean, portable and human-readable solution of ASCII text.

This would be my definition of "best".
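For illustration, here is a small sketch of the text approach from the Python side. It assumes the writer emits one value per line with 17 significant digits (on the C side that would be something like fprintf with "%.17g"), which is enough for an IEEE 754 double to survive the round trip; write_text and read_text are hypothetical names.

```python
import io

def write_text(values, out):
    """Write one float per line using 17 significant digits."""
    for v in values:
        out.write("%.17g\n" % v)

def read_text(inp):
    """Read one float per line, skipping blank lines."""
    return [float(line) for line in inp if line.strip()]

values = [0.0, 3.1416, 3.907985046680551e-14]
buf = io.StringIO()  # stand-in for a real text file
write_text(values, buf)
buf.seek(0)
print(read_text(buf) == values)  # True
```

Fewer digits make the file easier to read but may not reproduce the original double exactly.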

Licensed under: CC-BY-SA with attribution