I would like to maintain a large PyTables table in an HDF5 file.
Normally, as new data comes in, I append it to the existing table:
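import pandas as pd
# path_to_dataset and newdata are defined elsewhere
store = pd.HDFStore(path_to_dataset, 'a')  # 'a' opens the file in append mode
store.append("data", newdata)  # add the new rows to the table stored under "data"
store.close()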
However, if the columns of the stored data and those of the incoming newdata only partially overlap, the following error is raised:
Exception: cannot match existing table structure for [col1,col2,col3] on appending data
In these cases, I would like behavior similar to the normal DataFrame append function, which fills non-overlapping entries with NaN:
import pandas as pd
a = pd.DataFrame({"col1": range(10), "col2": range(10)})
b = pd.DataFrame({"b1": range(10), "b2": range(10)})
# The result has columns col1, col2, b1, b2; entries with no source value are NaN
a.append(b)
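Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so on a recent pandas the same in-memory behavior can be sketched with `pd.concat`. This is only an illustration of the desired result on ordinary DataFrames, not of the HDFStore case; the name `combined` is an assumption introduced for the example.

```python
import pandas as pd

a = pd.DataFrame({"col1": range(10), "col2": range(10)})
b = pd.DataFrame({"b1": range(10), "b2": range(10)})

# Stack the rows of both frames. The result carries the union of the
# columns, and cells that have no source value are filled with NaN.
combined = pd.concat([a, b], sort=False)

print(combined.shape)         # number of rows and columns after stacking
print(combined.isna().sum())  # count of NaN entries per column
```

As with `append`, the original row labels are kept, so the index repeats 0..9 twice; passing `ignore_index=True` to `pd.concat` would renumber the rows instead.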
Is it possible to have a similar operation "in memory", or do I need to create a completely new file?