I'm using h5py to access HDF5 files and store the h5py File objects in a class. But I'm experiencing some strange behavior in attempting to reassign a closed h5py file instance variable with a new one:

import sys

import h5py

class MyClass:
    def __init__(self, filename):
        self.h5file = None
        self.filename = filename

    def vartest(self):
        self.h5file = h5py.File(self.filename, 'r')
        print self.h5file
        self.h5file.close()
        print self.h5file
        newh5file = h5py.File(self.filename, 'r')
        print newh5file
        self.h5file = newh5file
        print self.h5file
        print newh5file

def main():
    filename = sys.argv[1]
    mycls = MyClass(filename)
    mycls.vartest()

if __name__ == '__main__':
    main()

Output:

<HDF5 file "test.h5" (mode r, 92.7M)>
<Closed HDF5 file>
<HDF5 file "test.h5" (mode r, 92.7M)>
<Closed HDF5 file>
<Closed HDF5 file>

Attempting to update the instance variable with the newly opened h5py File object appears to have somehow affected the state of the object, closing it. Regardless of the implementation on the h5py side, I don't see how this behavior makes sense from my understanding of the Python language (i.e., no overloading of the assignment operator).

This example is run with Python 2.6.5 and h5py 1.3.0. If you want to try this example but don't have an HDF5 file sitting around you can just change the file access mode from 'r' to 'a'.


Solution

Yes, this is a known bug in h5py 1.3, which shows up when you use HDF5 1.8.5 or newer. It's related to changes in the way identifiers are handled in 1.8.5. You can fix it by using HDF5 1.8.4 or earlier, or by upgrading to h5py 2.0.
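One way such a bug could produce exactly the output in the question is sketched below. This is a toy simulation, not h5py or HDF5 code: FakeLibrary, FakeFile and Holder are hypothetical names, and the assumption (not confirmed by anything above) is that the library reuses an identifier number after it is released while the old wrapper object still closes "its" number when it is garbage collected. Under that assumption, rebinding the attribute drops the last reference to the old wrapper, and its cleanup closes the new file.

```python
class FakeLibrary(object):
    """Hypothetical stand-in for a C library that hands out integer identifiers."""

    def __init__(self):
        self.open_ids = set()

    def open(self):
        # Assumption: the lowest free number is reused (identifier recycling).
        new_id = 1
        while new_id in self.open_ids:
            new_id += 1
        self.open_ids.add(new_id)
        return new_id

    def close(self, ident):
        self.open_ids.discard(ident)

    def is_valid(self, ident):
        return ident in self.open_ids


class FakeFile(object):
    """Hypothetical wrapper that closes its identifier when collected."""

    def __init__(self, lib):
        self.lib = lib
        self.ident = lib.open()

    def close(self):
        self.lib.close(self.ident)

    def __del__(self):
        # Closes whatever currently owns this number, even a newer object.
        self.lib.close(self.ident)

    def __bool__(self):
        return self.lib.is_valid(self.ident)

    __nonzero__ = __bool__  # Python 2 spelling of the truth-value hook

    def __repr__(self):
        return "<Open fake file>" if self else "<Closed fake file>"


class Holder(object):
    pass


lib = FakeLibrary()
holder = Holder()
holder.f = FakeFile(lib)        # receives identifier 1
holder.f.close()                # identifier 1 is released
new_file = FakeFile(lib)        # identifier 1 is handed out again
state_before = repr(new_file)
holder.f = new_file             # old wrapper loses its last reference
state_after = repr(new_file)
print(state_before)
print(state_after)
```

The sketch relies on CPython's reference counting, which runs `__del__` as soon as the old wrapper becomes unreachable. The assignment itself does nothing unusual; the side effect comes from the old object being destroyed.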

Other tips

Not sure if this will help, but searching through the source code I found this (abbreviated):

class HLObject(object):
    def __nonzero__(self):
        register_thread()
        return self.id.__nonzero__()

class Group(HLObject, _DictCompat):
    ...

class File(Group):
    def __repr__(self):
        register_thread()
        if not self:
            return "<Closed HDF5 file>"
        return '<HDF5 file "%s" (mode %s, %s)>' % \
            (os.path.basename(self.filename), self.mode,
             _extras.sizestring(self.fid.get_filesize()))

Because File defines no __str__ method, print falls back to __repr__ to produce the output. __repr__ first calls register_thread(), then tests the truth value of self (the `if not self` check).

To evaluate that check, Python searches the class hierarchy until it finds __nonzero__, which is defined on HLObject. That method calls register_thread() again and returns self.id.__nonzero__(), which is apparently returning False.

So you are correct that the issue is not with the name binding (assignment) itself. Why register_thread() and/or self.id ends up reporting the file as closed, I do not know.
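The lookup chain described above can be reproduced with plain Python classes. This is only a sketch with stand-in names (FakeId, Base and FakeFile are hypothetical and omit register_thread() entirely); it shows that the printed text depends solely on the truth value of the id object, regardless of which name refers to the file.

```python
class FakeId(object):
    """Hypothetical stand-in for the low-level identifier object."""

    def __init__(self):
        self.valid = True

    def __bool__(self):
        return self.valid

    __nonzero__ = __bool__  # Python 2 spelling


class Base(object):
    """Plays the role of HLObject: truth value is delegated to self.id."""

    def __bool__(self):
        return bool(self.id)

    __nonzero__ = __bool__


class FakeFile(Base):
    def __init__(self, name):
        self.name = name
        self.id = FakeId()

    def __repr__(self):
        # No __str__ is defined, so str() and print use this method.
        if not self:
            return "<Closed fake file>"
        return '<Fake file "%s">' % self.name


f = FakeFile("test.h5")
alias = f                   # a second name for the same object
open_text = str(f)
f.id.valid = False          # invalidate the identifier behind the object
closed_text = str(alias)
print(open_text)
print(closed_text)
```

Once the identifier is invalid, every reference to the object prints as closed, which matches the last two lines of the output in the question.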

Licensed under: CC-BY-SA with attribution