Question

I want to compress big text files with Python (files larger than 20 GB). I am by no means an expert, so I gathered the information I could find, and the following seems to work:

import bz2

with open('bigInputfile.txt', 'rb') as input:
    with bz2.BZ2File('bigInputfile.txt.bz2', 'wb', compresslevel = 9) as output:
        while True:
            block = input.read(900000)
            if not block:
                break
            output.write(block)

input.close()
output.close()

I am wondering whether this code is correct and whether there is a way to optimize it. I have the impression that I am missing something here.

Many thanks.


Solution

Your script seems correct, but it can be shortened, because shutil.copyfileobj performs the same chunked read/write loop for you:

import bz2
from shutil import copyfileobj

with open('bigInputfile.txt', 'rb') as input:
    with bz2.BZ2File('bigInputfile.txt.bz2', 'wb', compresslevel=9) as output:
        copyfileobj(input, output)
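
If you want to control how much data is read at a time, copyfileobj accepts an optional buffer size as its third argument. The following is only a sketch of how that might be wrapped up: the function name compress_file and the 1 MiB default for chunk_size are illustrative assumptions, not something from the question or from a benchmark.

```python
import bz2
from shutil import copyfileobj


def compress_file(src_path, dst_path, chunk_size=1024 * 1024):
    """Stream src_path into a bz2-compressed dst_path.

    Hypothetical helper: only chunk_size bytes are held in memory at a
    time, so the input size does not matter.
    """
    with open(src_path, 'rb') as src:
        with bz2.BZ2File(dst_path, 'wb', compresslevel=9) as dst:
            # The third argument is the buffer size used for each read/write.
            copyfileobj(src, dst, chunk_size)
```

Usage would look like compress_file('bigInputfile.txt', 'bigInputfile.txt.bz2'). The chunk size mainly affects how many read/write calls are made; the compression work itself usually dominates the run time, so it is worth measuring before tuning it.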

OTHER TIPS

Why are you calling the .close() methods? They are not needed, because the with statement closes both files when its block exits.
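
A quick way to convince yourself of this is to check the closed attribute inside and after the with block. This is a small demonstration only; the temporary path and the variable names inside and after are made up for the example.

```python
import bz2
import os
import tempfile

# Hypothetical throwaway path, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt.bz2')

with bz2.BZ2File(path, 'wb') as output:
    output.write(b'some text\n')
    inside = output.closed   # the file is still open inside the block

after = output.closed        # the with statement closed it on exit

print(inside, after)
```

Calling output.close() again afterwards is harmless, but it does nothing useful.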

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow