Question

I have a 27GB file that I am trying to move from an AWS EC2 Linux instance to S3. I've tried both the 's3put' command and the 's3cmd put' command. Both work with a test file, but neither works with the large file. No errors are given; the command returns immediately, but nothing happens.

s3cmd put bigfile.tsv s3://bucket/bigfile.tsv

Solution

Though S3 can store objects up to 5TB in size, a single PUT operation is limited to 5GB.

To upload files larger than 5GB (and, in practice, anything larger than about 100MB), you will want to use S3's multipart upload feature.

http://docs.amazonwebservices.com/AmazonS3/latest/dev/UploadingObjects.html

http://aws.typepad.com/aws/2010/11/amazon-s3-multipart-upload.html

(Ignore the outdated description of a 5GB object limit in the above blog post. The current limit is 5TB.)
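As a rough sanity check on what multipart upload implies for a file this size, here is a small sketch (not from the original answer) of the part arithmetic, assuming a 100MB part size; S3 requires parts of at least 5MB (except the last one) and allows at most 10,000 parts per upload:

file_size = 27 * 1024 ** 3                         # the 27GB file from the question, in bytes
part_size = 100 * 1024 ** 2                        # assumed 100MB part size

num_parts = (file_size + part_size - 1) // part_size
print(num_parts)                                   # 277 parts, well under the 10,000-part limit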

The boto library for Python supports multipart upload, and the latest boto release includes an "s3multiput" command-line tool that takes care of the complexity for you and even uploads parts in parallel.

https://github.com/boto/boto
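For reference, here is a minimal sketch (not from the original answer) of what a multipart upload looks like with boto's Python API; the bucket name, key, and part size are placeholders taken from the question, and error handling, retries, and parallelism are omitted:

import boto
from io import BytesIO

part_size = 100 * 1024 * 1024                 # 100MB per part (each part except the last must be >= 5MB)

conn = boto.connect_s3()                      # credentials come from the usual boto config/env vars
bucket = conn.get_bucket('bucket')            # placeholder bucket name from the question
mp = bucket.initiate_multipart_upload('bigfile.tsv')

part_num = 0
with open('bigfile.tsv', 'rb') as f:
    while True:
        data = f.read(part_size)
        if not data:
            break
        part_num += 1
        mp.upload_part_from_file(BytesIO(data), part_num)

mp.complete_upload()                          # S3 assembles the uploaded parts into the final object

The s3multiput tool mentioned above does essentially this for you, with the added benefit of retrying and parallelizing part uploads.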

OTHER TIPS

The file did not exist, doh. I realised this after running the s3cmd command in verbose mode by adding the -v flag:

s3cmd put -v bigfile.tsv s3://bucket/bigfile.tsv

s3cmd version 1.1.0 supports multipart upload as part of the "put" command, but it's still in beta (at the time of writing).
