Question

As the PyTables documentation says, PyTables stores data using NumPy:

"It is built on top of the HDF5 library, the Python language and the NumPy package. It features an object-oriented interface that, combined with C extensions for the performance-critical parts of the code, makes it a fast yet extremely easy-to-use tool for interactively storing and retrieving very large amounts of data"

...

"PyTables uses these NumPy containers as in-memory buffers to push the I/O bandwith towards the platform limits."

So what's the mechanism? How does PyTables use NumPy? In the end, it generates plain HDF5 files that are accessible from other languages...


Solution

HDF5 is a C library. It stores numbers, including floats, in a platform-independent manner (see the table titled "Examples of Native Datatypes and Corresponding C Types" in the HDF5 documentation; there's more information in the User's Guide).
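To make "platform independent" concrete, here is a minimal sketch using only NumPy. It assumes the usual correspondence between HDF5's standard float types (such as `H5T_IEEE_F64LE` and `H5T_IEEE_F64BE`) and NumPy's explicit byte-order dtypes (`"<f8"` and `">f8"`). It does not touch the HDF5 library itself; it only shows how a fixed byte layout can be read back on any machine.

```python
import numpy as np

values = np.array([1.5, -2.25, 1e10])

# Fixed byte layouts, comparable to H5T_IEEE_F64LE / H5T_IEEE_F64BE in HDF5.
little = values.astype("<f8")
big = values.astype(">f8")

# Same numbers, but the raw bytes differ: one is the byte-swapped form of the other.
raw_big = big.tobytes()
same_layout_after_swap = raw_big == little.byteswap().tobytes()

# A reader on any platform interprets the stored bytes with the stored
# layout, then converts to its own native float64.
restored = np.frombuffer(raw_big, dtype=">f8").astype(np.float64)
```

Because the byte layout is recorded alongside the data, the reader never has to guess the writer's platform.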

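PyTables simply converts between HDF5 datatypes and NumPy datatypes. NumPy arrays are only the in-memory containers; what ends up on disk is ordinary HDF5, which is why other languages can read it. PyTables also mixes Python code with native code to reduce I/O overhead.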
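A small round trip may help illustrate this. The sketch below assumes PyTables is installed (imported as `tables`); the file name `example.h5` and node name `data` are arbitrary choices made for illustration.

```python
import os
import tempfile

import numpy as np
import tables  # PyTables

data = np.arange(6, dtype=np.float64).reshape(2, 3)

# Write into a throwaway directory; the path is illustrative only.
path = os.path.join(tempfile.mkdtemp(), "example.h5")

# The NumPy array is handed to PyTables as the in-memory buffer to be written.
with tables.open_file(path, mode="w") as h5:
    h5.create_array(h5.root, "data", obj=data)

# Reading returns a NumPy array whose dtype matches the stored element type.
with tables.open_file(path, mode="r") as h5:
    node = h5.root.data
    atom = node.atom        # PyTables' description of the element type
    atom_type = atom.type   # a string such as "float64"
    restored = node.read()  # a numpy.ndarray
```

The resulting file is a regular HDF5 file, so it should also be readable with other HDF5 tools and bindings.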

Licensed under: CC-BY-SA with attribution