Currently there is no way to align data in PyTables :(
In practice I have done one of two things to get around this:
- I perform one extra step and convert the array to an aligned dtype:
np.require(sa, dtype=dt, requirements='ACO')
- or I arrange the fields in my dtype description such that they are aligned.
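A minimal sketch of the first option, assuming a made-up packed record layout (the field names and the zero-filled stand-in array are only for illustration; in practice sa would be whatever you read back from PyTables). I spell the requirements as a list here, which should be equivalent to the 'ACO' string:

```python
import numpy as np

# Hypothetical field list, used only for this illustration.
fields = [('f1', np.bool_), ('f2', '<i4'), ('f3', '<f8')]
packed = np.dtype(fields)             # no padding between fields
dt = np.dtype(fields, align=True)     # padded like a C struct

# Stand-in for an array read back from PyTables.
sa = np.zeros(4, dtype=packed)
sa['f2'] = np.arange(4)

# Cast to the aligned dtype and ask for an aligned, C-contiguous
# array that owns its data; NumPy copies when it has to.
aligned = np.require(sa, dtype=dt, requirements=['A', 'C', 'O'])

print(sa.dtype.itemsize, aligned.dtype.itemsize)
print(aligned.dtype.isalignedstruct)
```

The values are copied field by field, so the contents stay the same and only the memory layout changes.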
As an example for the 2nd option, suppose I have the following dtype:
dt = np.dtype([('f1', np.bool_), ('f2', '<i4'), ('f3', '<f8')], align=True)
If you print dt.descr you will see that a 3-byte void field has been added as padding to align the data:
>>> dt.descr
[('f1', '|b1'), ('', '|V3'), ('f2', '<i4'), ('f3', '<f8')]
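If you want to see where the padding lands, something like the following should do it. It only uses the dtype's own attributes (fields, itemsize, descr); on a typical 64-bit platform I would expect f2 to start at byte 4 and f3 at byte 8.

```python
import numpy as np

dt = np.dtype([('f1', np.bool_), ('f2', '<i4'), ('f3', '<f8')], align=True)

# dt.fields maps each name to a (dtype, offset) pair.
offsets = {name: dt.fields[name][1] for name in dt.names}

print(offsets)       # byte offset of each field
print(dt.itemsize)   # total bytes per record, padding included
print(dt.descr)      # padding shows up as an unnamed void entry
```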
But if I order the fields in my dtype from largest to smallest item size:
dt = np.dtype([('f3', '<f8'), ('f2', '<i4'), ('f1', np.bool_)])
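Every field now sits at an aligned offset (0, 8, 12) regardless of whether I specify align=True or align=False.
Someone please correct me if I am wrong, but even though dt.isalignedstruct is False without align=True, a record ordered as shown above is technically aligned. The one difference I know of is the itemsize: it is 13 bytes without align=True and 16 with it (3 bytes of trailing padding), so in an array only the padded version keeps every record on an 8-byte boundary. This has worked for me in applications where I need to send aligned data to C.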
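Here is a rough way to check this, comparing the reordered dtype with and without align=True. The variable names are mine. The field offsets come out the same in both cases, but the itemsize does not, which matters once you have more than one record in an array:

```python
import numpy as np

fields = [('f3', '<f8'), ('f2', '<i4'), ('f1', np.bool_)]
packed = np.dtype(fields)               # align defaults to False
padded = np.dtype(fields, align=True)

for name in packed.names:
    # fields[name][1] is the byte offset of that field
    print(name, packed.fields[name][1], padded.fields[name][1])

# Bytes per record: the padded version is rounded up to a
# multiple of the largest field alignment.
print(packed.itemsize, padded.itemsize)
print(packed.isalignedstruct, padded.isalignedstruct)
```

If the C side expects sizeof(struct) to match the record size, the padded version is the safer one to send.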
In the example you provided, sa.dtype.isalignedstruct is False, but
dt.descr = [('doc_id', '<u4'), ('word', '<u4'), ('tfidf', '<f4')]
and
sa.dtype.descr = [('doc_id', '<u4'), ('word', '<u4'), ('tfidf', '<f4')]
are identical, with no void spaces added to either descr. So the sa array is already aligned: every field is 4 bytes wide and the 12-byte itemsize is a multiple of 4.
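Since isalignedstruct only tells you whether align=True was used, not whether the layout happens to be aligned, a small helper can check the layout directly. This is a hypothetical function of my own (not part of NumPy or PyTables), and it only looks at top-level fields, so treat it as a sketch:

```python
import numpy as np

def is_naturally_aligned(dtype):
    """Hypothetical helper: True if every top-level field offset and the
    itemsize are multiples of the alignment the fields require."""
    dtype = np.dtype(dtype)
    if dtype.names is None:
        return True  # plain scalar dtype, nothing to check
    max_align = 1
    for name in dtype.names:
        field_dtype, offset = dtype.fields[name][:2]
        align = field_dtype.alignment
        if offset % align:
            return False
        max_align = max(max_align, align)
    # Records in an array stay aligned only if the itemsize is
    # a multiple of the largest field alignment.
    return dtype.itemsize % max_align == 0

doc_dt = np.dtype([('doc_id', '<u4'), ('word', '<u4'), ('tfidf', '<f4')])
print(doc_dt.isalignedstruct, is_naturally_aligned(doc_dt))
```

For the dtype in your example this reports an aligned layout even though isalignedstruct is False.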