Question

I am just picking up HDF5 and I am a bit confused about creating a datatype for memory versus creating a datatype for the file. What's the difference?

In this example, a single compound datatype is created for the in-memory data and then used both to create the dataset and to write the data to the file:

/*
 * Create the memory datatype.
 */
s1_tid = H5Tcreate (H5T_COMPOUND, sizeof(s1_t));
H5Tinsert(s1_tid, "a_name", HOFFSET(s1_t, a), H5T_NATIVE_INT);
H5Tinsert(s1_tid, "c_name", HOFFSET(s1_t, c), H5T_NATIVE_DOUBLE);
H5Tinsert(s1_tid, "b_name", HOFFSET(s1_t, b), H5T_NATIVE_FLOAT);

/* 
 * Create the dataset.
 */
dataset = H5Dcreate(file, DATASETNAME, s1_tid, space, H5P_DEFAULT);

/*
 * Write data to the dataset.
 */
status = H5Dwrite(dataset, s1_tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, s1);

However, in another example, the author also creates a separate compound datatype for the file, and it uses different member types. For instance, in the datatype for memory, serial_no uses H5T_NATIVE_INT, but in the datatype for the file, serial_no uses H5T_STD_I64BE. Why does he do this?

/*
 * Create the compound datatype for memory.
 */
memtype = H5Tcreate (H5T_COMPOUND, sizeof (sensor_t));
status = H5Tinsert (memtype, "Serial number",
            HOFFSET (sensor_t, serial_no), H5T_NATIVE_INT);
status = H5Tinsert (memtype, "Location", HOFFSET (sensor_t, location),
            strtype);
status = H5Tinsert (memtype, "Temperature (F)",
            HOFFSET (sensor_t, temperature), H5T_NATIVE_DOUBLE);
status = H5Tinsert (memtype, "Pressure (inHg)",
            HOFFSET (sensor_t, pressure), H5T_NATIVE_DOUBLE);

/*
 * Create the compound datatype for the file.  Because the standard
 * types we are using for the file may have different sizes than
 * the corresponding native types, we must manually calculate the
 * offset of each member.
 */
filetype = H5Tcreate (H5T_COMPOUND, 8 + sizeof (hvl_t) + 8 + 8);
status = H5Tinsert (filetype, "Serial number", 0, H5T_STD_I64BE);
status = H5Tinsert (filetype, "Location", 8, strtype);
status = H5Tinsert (filetype, "Temperature (F)", 8 + sizeof (hvl_t),
            H5T_IEEE_F64BE);
status = H5Tinsert (filetype, "Pressure (inHg)", 8 + sizeof (hvl_t) + 8,
            H5T_IEEE_F64BE);

/*
 * Create dataspace.  Setting maximum size to NULL sets the maximum
 * size to be the current size.
 */
space = H5Screate_simple (1, dims, NULL);

/*
 * Create the dataset and write the compound data to it.
 */
dset = H5Dcreate (file, DATASET, filetype, space, H5P_DEFAULT, H5P_DEFAULT,
            H5P_DEFAULT);
status = H5Dwrite (dset, memtype, H5S_ALL, H5S_ALL, H5P_DEFAULT, wdata);

What is the difference between these two methods?

Solution

From http://www.hdfgroup.org/HDF5/doc/UG/UG_frame11Datatypes.html:

H5T_NATIVE_INT corresponds to a C int type. On an Intel based PC, this type is the same as H5T_STD_I32LE, while on a MIPS system this would be equivalent to H5T_STD_I32BE.

That is to say, H5T_NATIVE_INT maps to a different concrete layout on different kinds of processors. If your data is only used in memory, meaning it will never leave this machine, you may want to use H5T_NATIVE_INT for better performance.

But if your data will be saved to a file that will be used by different systems, you should specify a concrete integer type, e.g. H5T_STD_I64BE or H5T_STD_I32LE, so that your data can be read correctly everywhere. If you use H5T_NATIVE_INT and you create a data file on an Intel-based PC, the numbers will be saved as H5T_STD_I32LE. When this file is used on a MIPS system, it will read the numbers as H5T_STD_I32BE, which is not what you expect.
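
For example, here is a minimal sketch of that advice, assuming the HDF5 1.8+ API; the file name "example.h5" and the dataset name "serials" are made up for illustration:

#include "hdf5.h"

int main(void)
{
    int     data[4] = {1, 2, 3, 4};   /* in-memory buffer described as native int */
    hsize_t dims[1] = {4};

    hid_t file  = H5Fcreate("example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(1, dims, NULL);

    /* The dataset is stored as 64-bit big-endian regardless of the machine. */
    hid_t dset = H5Dcreate(file, "serials", H5T_STD_I64BE, space,
                           H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* The library converts the native ints to the file type during the write. */
    H5Dwrite(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);

    H5Dclose(dset);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}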

OTHER TIPS

The other answer here is missing some key ideas and makes using HDF5 datatypes seem harder than it is.

To begin with, the NATIVE types are simply aliases for what the C types map to on that platform (this is detected when the HDF5 library is built). If you use them in your code and look at the file you created with the h5dump tool, you will not see the NATIVE datatype but will instead see the real datatype (H5T_STD_I32LE or whatnot). These NATIVE types are admittedly a little confusing, but they are convenient for mapping between C types and HDF5 datatypes without having to know the byte order of the system you are on.
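
If you want to see what H5T_NATIVE_INT aliases on your own machine without dumping a file, a quick runtime check with H5Tequal works; this is a minimal sketch with no assumptions beyond a working HDF5 install:

#include <stdio.h>
#include "hdf5.h"

int main(void)
{
    /* On a little-endian x86 machine this prints the first line; on a
       big-endian machine H5T_NATIVE_INT aliases H5T_STD_I32BE instead
       (assuming a 32-bit C int, which is typical). */
    if (H5Tequal(H5T_NATIVE_INT, H5T_STD_I32LE) > 0)
        printf("H5T_NATIVE_INT is H5T_STD_I32LE here\n");
    else
        printf("H5T_NATIVE_INT is not H5T_STD_I32LE here\n");
    return 0;
}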

The other misconception I want to clear up is the idea that you must convert types yourself: the library will convert types for you when it is reasonable to do so. If a dataset contains H5T_STD_I32BE values and you declare the I/O buffer to be of H5T_NATIVE_INT on a little-endian system, the HDF5 library will convert the big-endian dataset integers to in-memory little-endian integers for you. You should not need to perform byte swapping on your own.
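
As a minimal sketch of that, suppose a file "be_data.h5" holds a dataset "counts" of four H5T_STD_I32BE values (both names are hypothetical):

#include "hdf5.h"

int main(void)
{
    hid_t file = H5Fopen("be_data.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
    hid_t dset = H5Dopen(file, "counts", H5P_DEFAULT);

    int data[4];
    /* The buffer is declared as native int, so on a little-endian machine
       the library byte-swaps the big-endian file values during the read. */
    H5Dread(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);

    H5Dclose(dset);
    H5Fclose(file);
    return 0;
}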

Here is a simple way to think about it:

  • You declare a dataset's storage datatype when you call H5Dcreate().
  • You declare the I/O buffer's datatype when you call H5Dread() and H5Dwrite().

Again, if these differ and type conversions are reasonable, the data will be converted during the read/write calls.

Note that this type conversion could have performance implications in time-critical applications. If the platforms where data will be written and read differ in byte order or word size, you might want to explicitly set the datatype instead of using the NATIVE aliases so you can force the conversion to take place on the less important platform.

Example: Suppose you have a BE writer and an LE reader, and that the data arrive slowly but reads have to be as fast as possible. In this case, you would want to explicitly create your dataset to store H5T_STD_I32LE data so the datatype conversions happen on the writer.
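
The writer's side of that scenario might look like the following fragment (just a sketch; "samples" is a made-up dataset name, and file, space, and data are assumed to be set up as in the earlier write example):

/* Running on the big-endian writer: store the data little-endian so the
   time-critical little-endian reader gets a straight memory copy. */
hid_t dset = H5Dcreate(file, "samples", H5T_STD_I32LE, space,
                       H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

/* The BE-to-LE conversion happens here, on the writer, during the write. */
H5Dwrite(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);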

One last thing: it's better to use the HOFFSET(s,m) macro instead of calculating offsets by hand when constructing compound types. It's more maintainable and your code will look nicer.
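
For instance, a compound memory type built with HOFFSET might look like this (a sketch; the struct loosely echoes the sensor_t from the question):

typedef struct {
    int    serial_no;
    double temperature;
} sensor_t;

/* HOFFSET asks the compiler for each member's actual offset, padding
   included, so reordering or adding members never silently breaks it. */
hid_t memtype = H5Tcreate(H5T_COMPOUND, sizeof(sensor_t));
H5Tinsert(memtype, "Serial number",   HOFFSET(sensor_t, serial_no),   H5T_NATIVE_INT);
H5Tinsert(memtype, "Temperature (F)", HOFFSET(sensor_t, temperature), H5T_NATIVE_DOUBLE);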

If you want more information about HDF5 datatypes, check out chapter 6 of the user's guide here: https://support.hdfgroup.org/HDF5/doc/UG/HDF5_Users_Guide-Responsive%20HTML5/index.html

You can also check out the H5T API docs in the reference manual here: https://support.hdfgroup.org/HDF5/doc/RM/RM_H5Front.html

Licensed under: CC-BY-SA with attribution