Question

The EBS documentation states:

As an example, volumes that operate with 20 GB or less of modified data since their most recent Amazon EBS snapshot can expect an annual failure rate (AFR) of between 0.1% – 0.5%, where failure refers to a complete loss of the volume.

..but this doesn't give any indication of the AFR for a volume with, for example:

  • No snapshot at all; or
  • A fresh snapshot with no modified data.

I've seen it suggested that missing or damaged blocks can be automatically/silently recovered from snapshots but I can't see any reference to this in the documentation. Is this true?

Can I assume that if I have a volume with no changed data and a fresh snapshot, my AFR for the volume matches S3's reliability?

Was it helpful?

Solution 2

Snapshots taken of EBS Volumes are stored in S3. These snapshots get all the durability and availability benefits of S3. You can also copy snapshots to other regions, which is a nice insurance policy against a regional level outage.

If your EBS volume fails, you can then recover from your last snapshot. The more recent your snapshot, the more up-to-date your recovery story is. With the incremental nature of EBS snapshots performing them on a frequent basis is very practical.

EBS also provides "recovery volumes", which you can see from this AWS forum thread.

To my knowledge, the act of taking a snapshot doesn't directly impact the AFR of an active, running EBS volume. Rather, it just makes it easier for you to recover in the event of a failure.

OTHER TIPS

I took a three day class from AWS last year, and they told us unequivocally that taking snapshots greatly increases the reliability of an EBS volume. They did not explain why that was so, but hinted that EBS volumes store changes from the latest snapshot and that the snapshot itself is very stable (stored in S3). Successive snapshots apparently use little storage, as AWS is smart enough to store diffs.

They did not give any hard numbers on failure rates, though. They suggested configuring multiple EBS volumes using RAID if reliability of the volume is essential. However, they also recommended architecting your application so that it can tolerate failure of any instance, making it less important for each EBS volume to be durable.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top