My Python code receives a byte array that contains the bytes of an HDF5 file.

I'd like to read this byte array into an in-memory h5py file object without first writing it to disk. This page says that I can open a memory-mapped file, but it would be a new, empty file. I want to go from a byte array to an in-memory HDF5 file, use it, and discard it, without writing to disk at any point.

Is it possible to do this with h5py? (Or with HDF5 in C, if that is the only way.)


Solution

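You can wrap the bytes in a binary I/O stream (io.BytesIO) and pass that file-like object to h5py:

import io
import h5py
f = io.BytesIO(YOUR_H5PY_STREAM)  # YOUR_H5PY_STREAM: the bytes of the HDF5 file
h = h5py.File(f, 'r')             # read-only; nothing is written to disk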
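As a quick check of this approach, here is a minimal round-trip sketch. It assumes an h5py version that accepts Python file-like objects (2.9 or later, as far as I know). The helper names `make_hdf5_bytes` and `read_values` and the dataset name `values` are made up for illustration; in real code the bytes would come from wherever your program receives them.

```python
import io

import h5py
import numpy as np


def make_hdf5_bytes() -> bytes:
    """Build a small HDF5 file in memory and return its raw bytes (demo input only)."""
    buf = io.BytesIO()
    with h5py.File(buf, "w") as f:
        f.create_dataset("values", data=np.arange(5))
    return buf.getvalue()


def read_values(raw: bytes) -> list:
    """Open the bytes as a read-only, in-memory HDF5 file and read one dataset."""
    with h5py.File(io.BytesIO(raw), "r") as h:
        # Copy the data out before the file is closed.
        return h["values"][:].tolist()


raw = make_hdf5_bytes()      # stands in for the byte array you received
values = read_values(raw)
print(values)
```

Note that the dataset is copied into a plain list inside the `with` block, since h5py objects can no longer be read once the file is closed.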

Other tips

You can use io.BytesIO or tempfile to create HDF5 file objects, as shown in the official docs: http://docs.h5py.org/en/stable/high/file.html#python-file-like-objects.

The first argument to File may be a Python file-like object, such as an io.BytesIO or tempfile.TemporaryFile instance. This is a convenient way to create temporary HDF5 files, e.g. for testing or to send over the network.

tempfile.TemporaryFile

>>> import tempfile
>>> import h5py
>>> tf = tempfile.TemporaryFile()
>>> f = h5py.File(tf, 'w')  # pass 'w' explicitly: h5py 3.x defaults to read-only

or io.BytesIO

"""Create an HDF5 file in memory and retrieve the raw bytes

This could be used, for instance, in a server producing small HDF5
files on demand.
"""
import io
import h5py

bio = io.BytesIO()
with h5py.File(bio, 'w') as f:  # 'w' is required in h5py 3.x, where the default mode is 'r'
    f['dataset'] = range(10)

data = bio.getvalue() # data is a regular Python bytes object.
print("Total size:", len(data))
print("First bytes:", data[:10])

The following example uses PyTables (the tables package) instead of h5py; it can also read and manipulate HDF5 files.

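import urllib.request
import tables

url = 'https://s3.amazonaws.com/<your bucket>/data.hdf5'
with urllib.request.urlopen(url) as response:
    image = response.read()  # raw bytes of the HDF5 file

# With the core driver and no backing store, the file name is only a label:
# the file is built from the in-memory image and nothing is written to disk.
h5file = tables.open_file("data-sample.h5", driver="H5FD_CORE",
                          driver_core_image=image,
                          driver_core_backing_store=0)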
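If you would rather stay with h5py, its low-level API appears to offer a comparable route: `h5py.h5f.open_file_image` loads a file image from bytes, and the resulting file id can be wrapped in a regular `h5py.File`. The sketch below assumes that function is available in your h5py/HDF5 build; the dataset name `values` and the way the test image is produced are only for illustration.

```python
import io

import h5py
import numpy as np

# Build a small HDF5 image in memory to stand in for the received bytes.
buf = io.BytesIO()
with h5py.File(buf, "w") as f:
    f.create_dataset("values", data=np.arange(3))
image = buf.getvalue()

# Load the image through the low-level API (read-only with the default flags),
# then wrap the low-level file id in the usual high-level File object.
fid = h5py.h5f.open_file_image(image)
with h5py.File(fid) as h:
    loaded = h["values"][:].tolist()

print(loaded)
```

Compared with io.BytesIO, this lets the HDF5 library hold the image itself instead of calling back into a Python file-like object, which may matter for larger files.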
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow