Question

I have a NeXus file (foo.nxs) containing raw measurement data, and I wish to open it with pandas. However, when I try the typical

HDFStore('foo.nxs') or read_hdf('foo.nxs','/group')

it just returns an empty store:

<class 'pandas.io.pytables.HDFStore'>
File path: /foo.nxs
Empty

or the TypeError:

TypeError: cannot create a storer if the object is not existing nor a value are passed

All the examples in the docs start by creating an HDF file, storing data in it, and then retrieving it, all from within pandas. I want to know whether it is possible to read an HDF file that was not generated with pandas.

Here's a part of the output from ptdump, as requested by @Jeff:

/ (RootGroup) ''
  /._v_attrs (AttributeSet), 4 attributes:
    [HDF5_Version := '1.8.11',
    NeXus_version := '4.3.0',
    file_name := '/Messdaten/C9/20140423/messung_21h14m01.197.nxs',
    file_time := '2014-04-23T21:14:01+01:00']
/exp_root.Kerr.DoublePulse2D.SuperDuperScan_V2_00001 (Group) ''
  /exp_root.Kerr.DoublePulse2D.SuperDuperScan_V2_00001._v_attrs (AttributeSet), 1 attributes:
   [NX_class := 'NXentry']
/exp_root.Kerr.DoublePulse2D.SuperDuperScan_V2_00001/scan_data (Group) ''
  /exp_root.Kerr.DoublePulse2D.SuperDuperScan_V2_00001/scan_data._v_attrs (AttributeSet), 1 attributes:
   [NX_class := 'NXdata']
/exp_root.Kerr.DoublePulse2D.SuperDuperScan_V2_00001/scan_data/actuator_1_1 (CArray(5, 121), zlib(6)) ''
  atom := Float64Atom(shape=(), dflt=0.0)
  maindim := 0
  flavor := 'numpy'
  byteorder := 'little'
  chunkshape := (5, 121)

Solution

You have a plain-vanilla PyTables file, not a store written by pandas, so HDFStore finds no pandas objects in it. Such files are easy to read and do not strictly need pandas, but this shows how you can read the data (as NumPy arrays) through an HDFStore.

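import pandas as pd

with pd.HDFStore('foo.nxs', mode='r') as store:
    # The group name contains dots, so look the node up by its full path instead of by attribute access.
    data = store.get_node('/exp_root.Kerr.DoublePulse2D.SuperDuperScan_V2_00001/scan_data/actuator_1_1').read()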

docs are here.

This goes through the underlying PyTables file to reach the node and reads its data in. The dataset is stored as a CArray (a chunked, compressible array), so read() returns a plain NumPy array.
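Since pandas is not strictly needed here, the same read can be sketched with PyTables alone. The block below is only an illustration under stated assumptions: it first writes a small stand-in file (demo.nxs, a hypothetical name) with the same group layout and CArray shape as the ptdump output, then lists the leaves, reads the array, and wraps it in a DataFrame. PyTables may emit a NaturalNameWarning because the group name contains dots; that is harmless.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import tables

# Hypothetical stand-in for the real file: same layout as the ptdump output.
ENTRY = "exp_root.Kerr.DoublePulse2D.SuperDuperScan_V2_00001"
path = os.path.join(tempfile.mkdtemp(), "demo.nxs")

values = np.arange(5 * 121, dtype="float64").reshape(5, 121)
with tables.open_file(path, mode="w") as h5:
    entry = h5.create_group("/", ENTRY)
    scan = h5.create_group(entry, "scan_data")
    h5.create_carray(scan, "actuator_1_1", obj=values)

with tables.open_file(path, mode="r") as h5:
    # Walk the tree to see which datasets (leaves) the file contains.
    leaves = [node._v_pathname for node in h5.walk_nodes("/", classname="Leaf")]
    # Address the node by its full path; the dots rule out attribute access.
    data = h5.get_node(f"/{ENTRY}/scan_data/actuator_1_1").read()

# read() returned a NumPy array, which pandas can wrap directly.
df = pd.DataFrame(data)
print(leaves)
print(df.shape)
```

For the real file, only the second `with` block is needed, with `path` pointing at foo.nxs.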

HDFStore can also read plain-vanilla PyTables Table objects (those without the pandas metadata), but for array nodes like this one it is simple enough to just read them in directly.
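To illustrate the Table case, here is a minimal sketch, assuming a table written with PyTables only. The file name plain_table.h5, the node name readings, and the two column names are made up for the example; they do not come from the NeXus file in the question.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import tables

path = os.path.join(tempfile.mkdtemp(), "plain_table.h5")

# Hypothetical two-column table, written without any pandas metadata.
rows = np.array(
    [(0.0, 1.5), (1.0, 2.5), (2.0, 3.5)],
    dtype=[("position", "f8"), ("signal", "f8")],
)
with tables.open_file(path, mode="w") as h5:
    h5.create_table("/", "readings", obj=rows)

# pandas treats a plain PyTables Table node as a generic table and
# returns its columns as a DataFrame.
df = pd.read_hdf(path, "/readings")
print(df)
```

This only applies to Table nodes; the CArray in the question is not a Table, which is why it has to be read as an array.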

Licensed under: CC-BY-SA with attribution