Question

The h5py documentation (http://www.h5py.org/docs/high/dataset.html) says the following:

Importantly, h5py does not use NumPy to do broadcasting before the write...

>>> dset2 = f.create_dataset("MyDataset", (1000,1000,1000), 'f')
>>> data = np.arange(1000*1000, dtype='f').reshape((1000,1000))
>>> dset2[:] = data  # Does NOT allocate 3.8 G of memory

What does broadcasting refer to in this case?

Solution

Here, broadcasting means repeating the (1000,1000) array 1000 times so that it matches the (1000,1000,1000) shape of the dataset.
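To make that concrete, here is a small NumPy-only sketch of the same idea. The shapes are scaled down from the question's (a (3,4) array stretched to (5,3,4)), and the variable names are just for illustration. It shows that broadcasting itself can be a zero-copy view, and that memory is only spent when the broadcast result is materialized as a real array.

```python
import numpy as np

# Small stand-in for the (1000,1000) array from the question.
data = np.arange(3 * 4, dtype="f").reshape((3, 4))

# Broadcasting to (5, 3, 4) repeats `data` along a new leading axis.
# broadcast_to returns a read-only view: the leading stride is 0,
# so all 5 "copies" point at the same memory.
view = np.broadcast_to(data, (5, 3, 4))

# Materializing the view is what actually allocates the full array.
full = np.array(view)

print(view.shape, view.strides[0], np.shares_memory(view, data))
print(data.nbytes, full.nbytes)
```

At the question's sizes, the materialized array would hold 1000*1000*1000 float32 values, which is the multi-gigabyte allocation the documentation comment refers to.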

h5py does not build the full array in memory before writing to disk. Instead, it writes the (1000,1000) array 1000 times, producing the correct array on disk while using only about 1/1000 of the memory.
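A scaled-down version of the documentation example can be used to check this behaviour. This is a minimal sketch, assuming h5py is installed. The file name `broadcast_demo.h5` and the small (8,3,4) shape are made up for the demo, and the in-memory `core` driver is used only so that nothing is left on disk.

```python
import h5py
import numpy as np

# In-memory HDF5 file (core driver, no backing store), so the demo
# does not leave a file behind. The file name is arbitrary.
with h5py.File("broadcast_demo.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("MyDataset", (8, 3, 4), dtype="f")
    data = np.arange(3 * 4, dtype="f").reshape((3, 4))

    # The (3, 4) array is broadcast across the leading axis of the
    # (8, 3, 4) dataset during the write.
    dset[:] = data

    # Read everything back to verify each slab equals `data`.
    stored = dset[...]

all_match = bool((stored == data).all())
print(stored.shape, all_match)
```

Every slab `stored[i]` should equal `data`, which is the same result as assigning a fully broadcast array, but without ever building that array on the NumPy side.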

You can read more about the rules of NumPy broadcasting in the NumPy documentation.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow