I am trying to use Python to column-stack and row-stack additional data onto data I already have in an HDF5 file. I am recording images from a camera and saving each one to its own file. I then want to generate a single file with all of the images patched together, so I would like to create one dataset in a new file and stack the arrays from every image file into it.

I know that h5py lets me use datasets like NumPy arrays, but I do not know how to tell h5py to write the result back to the file. Below is a very simple example.

My question is how can I column stack the data from the HDF5 file with the second array (arr2) such that arr2 is saved to the file?

(Note: In my actual application, the data in the file will be much larger than in the example. Loading all of it into memory, column stacking, and then rewriting it to the file is therefore out of the question.)

import h5py
import numpy

arr1 = numpy.random.random((2000,2000))

with h5py.File("Plot0.h5", "w") as f:
    dset = f.create_dataset("Plot", data = arr1)

arr2 = numpy.random.random((2000,2000))

with h5py.File("Plot0.h5", "r+") as f:
    dset = f["Plot"]
    dset = numpy.column_stack((dset, arr2))

It seems like a trivial issue, but all of my searches have been unsuccessful. Thanks in advance.

Solution

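After rereading some of the h5py documentation, I realized my mistake: the dataset has to be created with maxshape so that it can be resized later, and new data has to be assigned into a slice of the dataset rather than rebinding the name dset to a NumPy array. Here is my new script structure that allows me to stack arrays in the HDF5 file: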

import h5py
import numpy

arr1 = numpy.random.random((2000, 2000))

# maxshape=(None, None) makes both axes resizable; h5py then stores the dataset chunked.
with h5py.File("Plot0.h5", "w") as f:
    dset = f.create_dataset("Plot", data=arr1, maxshape=(None, None))

dsetX, dsetY = 2000, 2000
go = ""
while go == "":
    go = input("Current Size: " + str(dsetX) + "  " + str(dsetY) + "  Continue?")
    if go != "":
        break  # any non-empty answer stops before another block is appended
    arr2 = numpy.random.random((2000, 2000))

    with h5py.File("Plot0.h5", "r+") as f:
        dset = f["Plot"]
        rows, cols = arr2.shape
        change = "column"

        dsetX, dsetY = dset.shape

        if change == "column":
            # Column stack: keep the rows, append arr2 as new columns on the right.
            x1, x2 = 0, rows
            y1, y2 = dsetY, dsetY + cols
            dset.resize(y2, axis=1)
        else:
            # Row stack: keep the columns, append arr2 as new rows at the bottom.
            x1, x2 = dsetX, dsetX + rows
            y1, y2 = 0, cols
            dset.resize(x2, axis=0)

        print("x1", x1, "x2", x2, "y1", y1, "y2", y2)
        print(dset.shape)

        # Only this slice is written; the existing data is never loaded into memory.
        dset[x1:x2, y1:y2] = arr2

        dsetX, dsetY = dset.shape
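
The resize-and-write step in the script above can be condensed into a small helper. This is only a sketch: the name append_block and the temporary demo file are made up for illustration, and it assumes a 2-D dataset created with maxshape=(None, None) and a block that matches the dataset along the axis that is not growing.

```python
import os
import tempfile

import h5py
import numpy


def append_block(dset, block, axis):
    """Grow a resizable 2-D dataset along `axis` and write `block` into the new region."""
    old = dset.shape[axis]
    dset.resize(old + block.shape[axis], axis=axis)
    if axis == 0:
        dset[old:, :block.shape[1]] = block   # new rows at the bottom
    else:
        dset[:block.shape[0], old:] = block   # new columns on the right


# Small demonstration on a temporary file (hypothetical path and sizes).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
first = numpy.random.random((4, 3))
second = numpy.random.random((4, 2))

with h5py.File(demo_path, "w") as f:
    dset = f.create_dataset("Plot", data=first, maxshape=(None, None))
    append_block(dset, second, axis=1)        # column stack

with h5py.File(demo_path, "r") as f:
    stacked = f["Plot"][...]
```

With axis=1 this behaves like numpy.column_stack for 2-D arrays, and with axis=0 like a row stack, but only the new block ever has to be held in memory.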

I hope this can help someone else. And of course, better solutions to this problem are welcome.
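
For the original goal of patching many image files into one file, the same idea might look roughly like the following. It is a sketch under assumptions, not a tested pipeline: merge_images and the file names are hypothetical, every input file is assumed to hold a 2-D dataset called "Plot", and all images are assumed to have the same number of rows.

```python
import os
import tempfile

import h5py
import numpy


def merge_images(paths, out_path, name="Plot"):
    """Copy the `name` dataset of each file in `paths` side by side into `out_path`."""
    with h5py.File(out_path, "w") as out:
        merged = None
        for path in paths:
            with h5py.File(path, "r") as src:
                image = src[name][...]        # only one image is in memory at a time
            if merged is None:
                # The first image creates the resizable output dataset.
                merged = out.create_dataset(name, data=image, maxshape=(None, None))
                continue
            start = merged.shape[1]
            merged.resize(start + image.shape[1], axis=1)
            merged[:, start:] = image         # write only the new columns


# Demonstration with three small stand-in "images" in a temporary directory.
workdir = tempfile.mkdtemp()
images = [numpy.random.random((5, 4)) for _ in range(3)]
paths = []
for i, image in enumerate(images):
    path = os.path.join(workdir, "image%d.h5" % i)
    with h5py.File(path, "w") as f:
        f.create_dataset("Plot", data=image)
    paths.append(path)

out_path = os.path.join(workdir, "merged.h5")
merge_images(paths, out_path)

with h5py.File(out_path, "r") as f:
    merged_data = f["Plot"][...]
```

Reading the result back in full, as the last lines do, is only for checking the small demo; with real camera data one would slice the merged dataset instead.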

Licensed under: CC-BY-SA with attribution