Get inferred dataframe types iteratively using chunksize

Question

I didn't think it would be this intuitive, otherwise I wouldn't have posted the question. But once again, pandas makes things a breeze. However, keeping the question as this information might be useful to others working with large data:

In [1]: chunker = pd.read_csv('DATASET.csv', chunksize=500, header=0)

# Store the dtypes of each chunk into a list and convert it to a dataframe:

In [2]: dtypes = pd.DataFrame([chunk.dtypes for chunk in chunker])

In [3]: dtypes.values[:5]
Out[3]:
array([[int64, int64, int64, object, int64, int64, int64, int64],
       [int64, int64, int64, int64, int64, int64, int64, int64],
       [int64, int64, int64, int64, int64, int64, int64, int64],
       [int64, int64, int64, int64, int64, int64, int64, int64],
       [int64, int64, int64, int64, int64, int64, int64, int64]], dtype=object)

# Very cool that I can take the max of these data types and it will preserve the hierarchy:

In [4]: dtypes.max().values
Out[4]: array([int64, int64, int64, object, int64, int64, int64, int64], dtype=object)

# I can now store the above into a dictionary:

types = dtypes.max().to_dict()

# And pass it into pd.read_csv fo the second run:

chunker = pd.read_csv('tree_prop_dset.csv', dtype=types, chunksize=500)