Question

Following the excellent advice of a poster yesterday, I started using the shutil.copyfileobj function to make a copy of a file.

My program should make an exact copy of the file, remove the last byte and save the new copy.

I tested it last night with some very small ASCII text files so I could check it was doing what I asked it to. This morning I tried it on some actual 'complex' files, a PDF and a JPG, and it looks like the copy function is not making a true copy. I looked at the resulting files in a hex editor, and I can see that after roughly offset 0x300 something odd is going on: either data is being added, or data is being changed on copy. I cannot tell which.

My program iteratively takes off a byte and saves a new version, and I can see that the newly created files are consistently different from the original file (beyond the expected missing last byte).

def doNibbleAndSave(srcfile,fileStripped,strippedExt,newpath):
 counter = '%(interationCounter)03d' % {"interationCounter":interationCounter} #creates the filename counter label
 destfile = newpath + "\\" + fileStripped + "_" + counter + strippedExt #creates the new filename 
 with open(srcfile, 'r') as fsrc:
  with open(destfile, 'w+') as fdest:
   shutil.copyfileobj(fsrc, fdest)
   fdest.seek(nibbleSize, os.SEEK_END) #sets the number of bytes to be removed
   fdest.truncate()
 srcfile = destfile #makes the iterator pick up the newly 'nibbled' file to work on next
 return (srcfile)

I can also see that the newly created objects are significantly smaller than the source file.


Solution

As you already noticed, you should open the files in binary mode: open(srcfile, "rb") and open(destfile, "wb+"). Otherwise, Python assumes the files are text files and may perform newline conversion, depending on the platform (see the Python tutorial's section on reading and writing files for details).
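
A minimal sketch of the binary-mode copy-and-truncate step, for illustration only. The names copy_and_nibble and nibble_size are hypothetical and not taken from the question. One further assumption: the question's seek(nibbleSize, os.SEEK_END) appears to move past the end of the file if nibbleSize is positive, so this sketch seeks with a negative offset to step back from the end before truncating.

```python
import os
import shutil
import tempfile


def copy_and_nibble(src_path, dest_path, nibble_size=1):
    """Copy src_path to dest_path byte for byte, then drop the last nibble_size bytes.

    Hypothetical helper; raises OSError if nibble_size exceeds the file size.
    """
    # "rb" / "wb+" disable any newline translation, so the bytes are copied unchanged.
    with open(src_path, "rb") as fsrc, open(dest_path, "wb+") as fdest:
        shutil.copyfileobj(fsrc, fdest)
        # A negative offset relative to SEEK_END moves backwards from the end.
        fdest.seek(-nibble_size, os.SEEK_END)
        # truncate() with no argument cuts the file at the current position.
        fdest.truncate()
    return dest_path


# Small demonstration on bytes that text mode could mangle (CR LF, 0x1A).
work_dir = tempfile.mkdtemp()
source = os.path.join(work_dir, "source.bin")
payload = b"\x00\r\n\x1a\xff\n" * 200
with open(source, "wb") as f:
    f.write(payload)

copy_path = copy_and_nibble(source, os.path.join(work_dir, "copy_001.bin"))
with open(copy_path, "rb") as f:
    copied = f.read()
```

After this runs, copied should equal payload with only its final byte removed, which is an easy way to confirm that nothing else changed during the copy.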

Licensed under: CC-BY-SA with attribution