Question

I would like to read a data grid (a 3D array of floats) from an .xsf file. (The format is documented at http://www.xcrysden.org/doc/XSF.html; see the BEGIN_BLOCK_DATAGRID_3D block.)

The problem is that the data are written in 5 columns, and if the number of elements Nx*Ny*Nz is not divisible by 5, then the last line has fewer than 5 values. For this reason I'm not able to use numpy.genfromtxt() or numpy.loadtxt().

I wrote a subroutine that solves the problem, but it is terribly slow (probably because it uses tight Python loops). The files I want to read are large (>200 MB; 200x200x200 = 8000000 numbers in ASCII).

Is there a really fast way to read such unfriendly formats into an ndarray in Python / NumPy?


XSF datagrids look like this (example for shape=(3,3,3)):

BEGIN_BLOCK_DATAGRID_3D
 BEGIN_DATAGRID_3D_this_is_3Dgrid          
 3  3  3         # number of elements Nx Ny Nz                     
 0.0 0.0 0.0     # grid origin in real space                     
 1.0 0.0 0.0     # grid size in real space                    
 0.0 1.0 0.0                               
 0.0 0.0 1.0                          
   0.000  1.000  2.000  5.196  8.000   # data in 5 columns     
   1.000  1.414  2.236  5.292  8.062        
   2.000  2.236  2.828  5.568  8.246        
   3.000  3.162  3.606  6.000  8.544        
   4.000  4.123  4.472  6.557  8.944                   
   1.000  1.414                       # this is the problem
  END_DATAGRID_3D                      
 END_BLOCK_DATAGRID_3D                   

No correct solution

OTHER TIPS

I got something working with Pandas and NumPy. Pandas fills in NaN for the values missing from the short last line, and those can then be filtered out.

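import pandas as pd
import numpy as np

# skiprows=7 skips the header lines; skipfooter=2 drops the two END_ lines
# (skipfooter is only supported by the Python parser engine).
df = pd.read_csv("xyz.data", header=None, delimiter=r'\s+', dtype=float,
                 skiprows=7, skipfooter=2, engine="python")
data = df.values.flatten()
data = data[~np.isnan(data)]   # drop the NaN padding of the short last line
result = data.reshape((data.size // 3, 3))   # integer division: a shape entry must be an int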

Output

>>> result
array([[ 0.   ,  1.   ,  2.   ],
       [ 5.196,  8.   ,  1.   ],
       [ 1.414,  2.236,  5.292],
       [ 8.062,  2.   ,  2.236],
       [ 2.828,  5.568,  8.246],
       [ 3.   ,  3.162,  3.606],
       [ 6.   ,  8.544,  4.   ],
       [ 4.123,  4.472,  6.557],
       [ 8.944,  1.   ,  1.414]])
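
A NumPy-only alternative, as a rough sketch rather than a tested solution: because the values are only whitespace-separated, the ragged last line stops mattering if the whole data block is joined into one string and converted in a single call. The helper name read_xsf_datagrid_3d is hypothetical, and the sketch assumes the file contains one DATAGRID_3D block, that "#" starts a comment (as in the example above), and that the x index varies fastest in the file, so the array is returned with shape (Nz, Ny, Nx). It has not been benchmarked on 200 MB files.

```python
import numpy as np


def read_xsf_datagrid_3d(stream):
    """Hypothetical helper: read the first DATAGRID_3D block from text lines."""
    lines = iter(stream)

    # Skip ahead to the "BEGIN_DATAGRID_3D..." line that opens the grid.
    for line in lines:
        if line.strip().startswith("BEGIN_DATAGRID_3D"):
            break
    else:
        raise ValueError("no BEGIN_DATAGRID_3D line found")

    def header_numbers(cast):
        # Drop any trailing "# comment" before converting the fields.
        return [cast(tok) for tok in next(lines).split("#")[0].split()]

    nx, ny, nz = header_numbers(int)
    origin = np.array(header_numbers(float))
    vectors = np.array([header_numbers(float) for _ in range(3)])

    # Collect the data lines up to END_DATAGRID_3D, without parsing them yet.
    chunks = []
    for line in lines:
        if line.strip().startswith("END_DATAGRID"):
            break
        chunks.append(line.split("#")[0])

    # One split and one conversion for the whole block; line lengths are irrelevant.
    values = np.array(" ".join(chunks).split(), dtype=np.float64)
    if values.size != nx * ny * nz:
        raise ValueError(f"expected {nx * ny * nz} values, found {values.size}")

    # Assumption: x varies fastest in the file, so C order gives (nz, ny, nx).
    grid = values.reshape((nz, ny, nx))
    return grid, origin, vectors
```

It could be called as `with open("xyz.data") as f: grid, origin, vectors = read_xsf_datagrid_3d(f)`. If the inline comments never occur in real files, the two split("#") calls can be dropped.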
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow