Question

I'm fairly new to both Python and the S3/Glacier integration interface that boto provides. However, I have run into several apparently undocumented and unresolved problems that are seriously hindering progress on my current work project.

My current dilemma concerns the restore() function in the boto library. Quite simply, it does not appear to work at all. I have suspected for some time that the problem is related to the Key object's inconsistent tracking of the storage_class of data stored in S3 buckets. This issue documents some of the specifics: https://github.com/boto/boto/issues/1173

To elaborate on the Key consistency issue, consider the following scenario concerning an object that has already been archived to Glacier from S3:

from boto.s3.connection import S3Connection
from boto.s3.key import Key
...
conn = S3Connection(access_key_id, secret_key)
bucket = conn.get_bucket(bucket_name)
key_object = Key(bucket)

# Lookup 1: a HEAD request via get_key()
print(bucket.get_key(filename).storage_class)
...
key_object.key = filename
# Lookup 2: scan the bucket listing for the same key
for item in bucket.list():
    if item.key == filename:
        break

print(item.storage_class)

A couple of clarifications: I understand that the for loop searching the bucket for a key is wildly inefficient; the point is that the two lookups disagree.

The first print statement will yield: u'STANDARD'

The second: u'GLACIER'

Moving forward, I believe this inconsistency is affecting the efficacy of the restore() operation. If I call key.restore(days=num_days) on either of the 'key' derivations above, neither gives any indication of having had any effect in restoring the object from Glacier back to standard S3 accessibility; the call simply returns None. At this point, I'm at a complete loss as to what could explain this malfunction. Is it something programmatically wrong on my part, or is there something innately broken in boto?

Any assistance you could provide me with would be greatly appreciated.

Thank you.

NOTE: I did not forget basic error checking, e.g. does the file exist in the bucket? Has the file already been restored? etc.

Solution 2

I figured out what my confusion was. Once an object has transitioned to Glacier, it stays in Glacier. It can be restored so that it is temporarily downloadable, but its storage class never changes from GLACIER back to S3 standard; to change the class, you must make a copy of the object and ensure that the copy uses standard storage.
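That restore-then-copy sequence can be sketched as follows. This is my own hypothetical helper (`restore_then_copy` is not part of boto), assuming `bucket` is a boto Bucket obtained as in the snippets above; `copy_key` with `storage_class='STANDARD'` is the standard boto way to rewrite an object's storage class in place:

```python
def restore_then_copy(bucket, key_name, days=5):
    """Request a Glacier restore for key_name and, once it completes,
    copy the object over itself with STANDARD storage so the storage
    class actually changes back (a restore alone never does).
    Returns True once the standard-storage copy has been made."""
    key = bucket.get_key(key_name)     # HEAD request; fills ongoing_restore
    if key.ongoing_restore is None:
        key.restore(days=days)         # asynchronous; returns None on success
        return False                   # call again once the restore finishes
    if key.ongoing_restore:
        return False                   # restore still in progress
    # Restore finished: an in-place copy with standard storage changes
    # the storage class for good.
    bucket.copy_key(key_name, bucket.name, key_name,
                    storage_class='STANDARD')
    return True
```

Because the restore is asynchronous (it can take hours), the helper is written to be called repeatedly rather than to block.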

The Key inconsistency remains an issue, but I've found some hacky workarounds (which I'd prefer to avoid, but for now I have no choice).
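One such workaround can be sketched like this (`real_storage_class` is a hypothetical helper name, assuming `bucket` is a boto Bucket as above): look the key up through bucket.list() with a prefix, since listing results carry the storage class that a plain get_key() HEAD lookup misreports, while the prefix keeps the scan from touching the whole bucket:

```python
def real_storage_class(bucket, key_name):
    """Return the storage class as reported by a bucket listing,
    which (unlike bucket.get_key()) reflects GLACIER objects."""
    for key in bucket.list(prefix=key_name):  # only keys sharing the prefix
        if key.name == key_name:
            return key.storage_class
    return None  # key not found in the bucket
```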

OTHER TIPS

If you're already keeping track of the filenames, have you tried following the example in the docs? http://docs.pythonboto.org/en/latest/s3_tut.html#transitioning-objects-to-glacier

from boto.s3.connection import S3Connection

conn = S3Connection(access_key_id, secret_key)
bucket = conn.get_bucket(bucket_name)
key = bucket.get_key(filename)
key.restore(days=5)

Use key.ongoing_restore to check the restore status: it is None if no restore has been requested, True while the restore is in progress, and False once the temporary copy is available.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow