Question

I wrote a script to import data from a (quite messy) datafile. Each line is read and processed separately in a loop.

I wrote the following code to skip the header and blank lines:

for line in rd_file.readlines():
    line_1 = line.rstrip("\n")
    # Decide what to do based on the content in the line.
    if "#" in line.lower():
        header_flag = True
        # Don't print the header
        pass
    elif line.strip() == "":
        pass
    else:
        [...]

While running the script I noticed a memory leak. I tracked it down with memory_profiler, which attributes it to:

elif line.strip() == "": 
  pass 

This is what I get from memory_profiler:

45    204.5 MiB    160.6 MiB           elif line.strip() == "":

How can skipping a blank line take up 160 MB? Do you have any suggestions on how to fix this?


Solution

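I recommend not calling readlines() and instead relying on Python's file iterator. readlines() reads the entire file into a list of lines before the loop even starts, so the whole file sits in memory at once. The 160 MiB most likely comes from that list rather than from the blank-line check itself. Iterating over the file object directly yields one line at a time: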

for line in rd_file:
    line_1 = line.rstrip("\n")   
    ...
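
For completeness, here is a minimal, self-contained sketch of the same idea with the header and blank-line checks from the question folded in. The helper name iter_data_lines and the sample data are made up for illustration, and io.StringIO stands in for the real rd_file; any open file object should behave the same way, since it is also iterated line by line.

```python
import io


def iter_data_lines(lines, comment_char="#"):
    """Lazily yield data lines, skipping header and blank lines.

    `lines` can be any iterable of strings, e.g. an open file object.
    Hypothetical helper, not part of the original script.
    """
    for line in lines:
        # Skip header lines (anything containing the comment character).
        if comment_char in line:
            continue
        # Skip lines that are empty or contain only whitespace.
        if not line.strip():
            continue
        yield line.rstrip("\n")


# Stand-in for the real data file (assumed sample content).
sample = io.StringIO("# header\n\n1 2 3\n   \n4 5 6\n")
data = list(iter_data_lines(sample))
print(data)  # ['1 2 3', '4 5 6']
```

With a real file you would write something like `for line_1 in iter_data_lines(rd_file):` and process each line inside the loop, so only the current line is held in memory.
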
Licensed under: CC-BY-SA with attribution