How to read fortran 77 unformatted binary file into python

https://stackoverflow.com/questions/23377274

12-07-2023
|

Question

I have a script in IDL which reads an unformatted binary file (F77) and outputs it as a .sav file but I want to convert this script to python and save as an .npz file and I am having trouble on the read line.

IDL Code:

;create save file for QBO model output

;---------------------------------------------------------------------
;input data here (Only adjust this info)
;---------------------------------------------------------------------

  time=7300  ;duration from fortran code
  tstep=5   ;time step from fortran code
  inputfile='/home/cwilliams/Metr_205/qbo/uvwtom.'; binary file from fortran
  outputsavefile='~cwilliams/Metr_205/qbo/qbo_FULLX.sav'; name of save file with     variables
;--------------------------------------------------------------------------------
;--------------------------------------------------------------------------------





;==============================================================================
;   Do Not Adjust the Following Code
;==============================================================================
fname='dat'
nstep=time/tstep ;time/timestep
nm=nstep-1

um=fltarr(17,34)
vm=fltarr(17,34)
wm=fltarr(17,34)
tm=fltarr(17,34)
pm=fltarr(17,34)
om=fltarr(17,34)
vpm=fltarr(17,34)
vdm=fltarr(17,34)


u=fltarr(17,34,nstep)
v=fltarr(17,34,nstep)
w=fltarr(17,34,nstep)
temp=fltarr(17,34,nstep)
pressure=fltarr(17,34,nstep)
ozone=fltarr(17,34,nstep)

date=findgen(nstep)*(2.*tstep/365.)
ulevels2=[-30, -25, -20, -15, -10, -5, 5, 10, 15,  20, 25, 30]
ulevels3=findgen(15)*3.-12
lab=findgen(20)*0+1
lat=findgen(17)*2.7-1.35

ht=findgen(34)*0.5+17.


openr,37,inputfile+fname,/f77_unformatted & title='base'
for n=0,nm do begin
    readu,37,um,vm,wm,tm,om,pm
    u(*,*,n)=um/100.
    v(*,*,n)=vm
    w(*,*,n)=wm
    temp(*,*,n)=tm*700000./2.87e6
    pressure(*,*,n)=pm/10000.
    ozone(*,*,n)=om
endfor
close,37

save,filename=outputsavefile,u,v,w,temp,pressure,ozone,date,ulevels2,ulevels3,lab,$
lat,ht
stop
end

Python Code: #create save file for QBO model output

#---------------------------------------------------------------------
#input data here (Only adjust this info)
#---------------------------------------------------------------------

time=7300  #duration from fortran code
tstep=5   #time step from fortran code
inputfile='/home/cwilliams/metr51/lab12/uvwtom.dat'# binary file from fortran
outputsavefile='~cwilliams/metr51/lab12/qbo_FULLX.sav'# name of save file with variables
#--------------------------------------------------------------------------------
#--------------------------------------------------------------------------------


import numpy as n


#==============================================================================
#   Do Not Adjust the Following Code
#==============================================================================
nstep=time/tstep #time/timestep
nm=nstep-1

um=n.empty((17,34))
vm=n.empty((17,34))
wm=n.empty((17,34))
tm=n.empty((17,34))
pm=n.empty((17,34))
om=n.empty((17,34))
vpm=n.empty((17,34))
vdm=n.empty((17,34))


u=n.empty((17,34,nstep))
v=n.empty((17,34,nstep))
w=n.empty((17,34,nstep))
temp=n.empty((17,34,nstep))
pressure=n.empty((17,34,nstep))
ozone=n.empty((17,34,nstep))

date=n.arange(nstep)*(2.*tstep/365.)
ulevels2=n.array([-30, -25, -20, -15, -10, -5, 5, 10, 15,  20, 25, 30])
ulevels3=n.arange(15)*3.-12
lab=n.arange(20)*0+1
lat=n.arange(17)*2.7-1.35

ht=n.arange(34)*0.5+17.

This is where I need some assitance:

f=open(inputfile,'rb')     
data=f.read()
for i in range(nm+1):
#    readu,37,um,vm,wm,tm,om,pm
#    u(:,:,i)=um/100.
#    v(:,:,i)=vm
#    w(:,:,i)=wm
#    temp(:,:,i)=tm*700000./2.87e6
#    pressure(:,:,i)=pm/10000.
#    ozone(*,*,n)=om
#
#n.savez(filename=outputsavefile,u=u,v=v,w=w,temp=temp,pressure=pressure,ozone=ozone,date=date,ulevels2=ulevels2,ulevels3=ulevels3,lab=lab,lat=lat,ht=ht)

I know there is an issue with the row/column order between IDL and Python but I think I can fix this once I read the code in.

Solution

First off, I have to assume that you wrote the original Fortran file using sequential access rather than direct access, this is a very important distinction. I do know IDL and the read command you have set up in your example is consistent with sequential access.

This matters because Fortran sequential access adds 'record markers' to the file for every call to WRITE(), so you'll have to account for that in your Python read routine, much as in the answer that george linked in the comments.

As to row/column order, it really just matters which dimension is iterating the fastest in memory. A Fortran 2D array, for example, will write in memory with the first array dimension as the fastest iterating. When you read the same thing into a 2D Python variable, it will be the last dimension - keep that in mind when reshaping a flat array, and you'll have to transpose it if you want to keep your indexing.

I think the following code will do approximately what you want, for brevity I just included your um and vm variables:

import numpy as np

um=np.empty((34,17), dtype='float32') # Make these dimensions "backwards" for easier reshaping
vm=np.empty((34,17), dtype='float32') # Also watch out, I believe the default type is float64

f = open(inputfile,'rb')
recl = np.zeros(1,dtype=np.uint32)
for i in range(nm+1):
    recl = np.fromfile(f, dtype='uint32', count=1)
    tmpu = np.fromfile(f, dtype='float32', count=um.size) # These arrays will be flat
    tmpv = np.fromfile(f, dtype='float32', count=vm.size)
    recl = np.fromfile(f, dtype='uint32', count=1)

    um = np.transpose(np.reshape(tmpu, um.shape))
    vm = np.transpose(np.reshape(tmpv, vm.shape))

Then you may also do whatever you wish in the loop with your other variables, and you should be able to index um and vm like you are used to in IDL or Fortran. This might not be the easiest or best way to do it, but I think it will at least get you started.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow