Question

In the last week I have noticed that about 5-10% of the gz files I copy down using gsutil (now v3.42) are corrupt. When I look at the files in the GCS UI, sure enough, they are larger than what gsutil downloaded. Extracting the downloaded copies fails with errors like:

(FileNameHere).gz: Unexpected end of archive
(FileNameHere).gz: CRC failed in (FileNameHere). The file is corrupt

The use case is copying gzip files from GCS down to one of our Windows Server 2008 R2 machines.

Has anyone else seen this problem?


Solution

Can you please provide us with a specific example: Complete bucket & object name, the specific date/time when you downloaded the object, and the size of the file after downloading using gsutil? That way we can investigate and try to reproduce the case you're seeing.

If you'd prefer not to post the specific bucket and object names on Stack Overflow, you can communicate privately with the GCS team by emailing gs-team@google.com.

Thanks,

Mike

OTHER TIPS

This snippet goes along with the comments above. It retries the copy command for each object until gsutil reports success:

#!/bin/sh

# Make gsutil reachable from the Cygwin shell.
export PATH=${PATH}:/cygdrive/c/gsutil
ZIPFOLDER="d:/YourPathHere"
for obj in \
  gs://YourBucketName/YourFileName_01.gz \
  gs://YourBucketName/YourFileName_02.gz \
  gs://YourBucketName/YourFileName_03.gz \
...
  gs://YourBucketName/YourFileName_NN.gz ; do
    # Re-run the copy until gsutil exits with status 0.
    until gsutil cp "$obj" "$ZIPFOLDER" ; do :; done
done
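
The loop above only retries when gsutil itself exits with an error. If a copy can exit cleanly and still leave a truncated file (an assumption, but it would match the "Unexpected end of archive" errors in the question), a variant that also runs `gzip -t` on the result may help. This is a rough sketch, not a tested fix: the function name `fetch_verified` and the `MAX_TRIES` and `COPY_CMD` variables are made up for illustration, and it gives up after a fixed number of attempts instead of looping forever.

```sh
#!/bin/sh

# Hypothetical settings (not from the original post).
MAX_TRIES=${MAX_TRIES:-5}            # attempts per object before giving up
COPY_CMD=${COPY_CMD:-"gsutil cp"}    # overridable, e.g. COPY_CMD=cp for a dry run

# fetch_verified SRC DESTDIR
# Copies SRC into DESTDIR and accepts the result only if `gzip -t` passes.
fetch_verified() {
    src=$1
    destdir=$2
    dest="$destdir/$(basename "$src")"
    try=1
    while [ "$try" -le "$MAX_TRIES" ]; do
        # $COPY_CMD is left unquoted on purpose so "gsutil cp" splits into two words.
        if $COPY_CMD "$src" "$dest" && gzip -t "$dest" 2>/dev/null; then
            return 0
        fi
        echo "attempt $try failed for $src" >&2
        rm -f "$dest"                # discard the partial or corrupt copy
        try=$((try + 1))
    done
    return 1
}
```

Inside the loop it could replace the `until` line, for example: fetch_verified "$obj" "$ZIPFOLDER" || echo "giving up on $obj" >&2
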
Licensed under: CC-BY-SA with attribution