I had exactly the same problem. If you search for this error in the boto GitHub issues, you will see we are not alone.
There's also a known, accepted issue: https://github.com/boto/boto/issues/2207
Reaching performance limits of AWS S3
The truth is that we have become so used to boto and the AWS S3 service that we have forgotten these are really distributed systems, which might break in some cases.
I was archiving (download, tar, upload) a huge number of files (about 3 years' worth of around 15 feeds, each having about 1440 versions a day) and used Celery to do this faster. And I have to say that I was sometimes getting these errors more often, probably because I was reaching the performance limits of AWS S3. The errors often appeared in chunks (in my case I was uploading at about 60 Mbps for a couple of hours).
Training S3 performance
When I was measuring performance, it got "trained". After some hours, the responsiveness of the S3 bucket jumped up; AWS had probably detected the higher load and spun up some more instances to serve it.
Try the latest stable version of boto
Another thing is that boto retries internally in many cases, so many failures are hidden from our calls. Sometimes things got a bit better after upgrading to the latest stable version.
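If you want to lean on that built-in retry behaviour more aggressively, classic boto reads a retry count from its config file. A minimal sketch (assuming you use the standard `~/.boto` config; the value 10 is just an illustration):

```ini
# ~/.boto -- raise boto's internal retry count for transient errors
[Boto]
num_retries = 10
```

The same setting can also be applied at runtime with `boto.config.set('Boto', 'num_retries', '10')` before opening a connection.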
My conclusions are:
- try upgrading to the latest stable boto
- when the error rate grows, lower the pressure
- accept the fact that AWS S3 is a distributed service with occasional, rare performance problems
In your code, I would definitely recommend adding some sleep between retries (at least 5 s, but 30 s would seem fine to me); otherwise you are just pushing harder and harder against a system which might be in a shaky state at the moment.
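To make the "add some sleep" advice concrete, here is a minimal sketch of retrying an upload with exponential backoff plus jitter. The `upload` callable is a hypothetical stand-in for whatever your code does (e.g. a boto `set_contents_from_filename` call); the delays and attempt count are illustrative, not prescriptive:

```python
import random
import time

def upload_with_backoff(upload, max_attempts=5, base_delay=5.0, max_delay=60.0):
    """Call `upload()` (a zero-arg callable), retrying on failure.

    Instead of hammering a possibly overloaded S3 endpoint, wait
    base_delay, 2*base_delay, 4*base_delay, ... (capped at max_delay)
    between attempts, plus a little random jitter so parallel workers
    do not all retry at the same instant.
    """
    for attempt in range(max_attempts):
        try:
            return upload()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            delay = min(base_delay * (2 ** attempt), max_delay)
            time.sleep(delay + random.uniform(0, 1))
```

Usage would look something like `upload_with_backoff(lambda: bucket.new_key(name).set_contents_from_filename(path))`, where `bucket`, `name`, and `path` are your own objects.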