encrypting and/or decrypting large files (AES) on a memory and storage constrained system, with “catastrophe recovery”

StackOverflow https://stackoverflow.com/questions/11300128

سؤال

I have a fairly generic question, so please pardon if it is a bit vague.

So, let's a assume a file of 1GB, that needs to be encrypted and later decrypted on a given system.

Problem is that the system has less than 512 mb of free memory and about 1.5 GB storage space (give or take), so, with the file "onboard" we have about ~500 MB of "hard drive scratch space" and less than 512 mb RAM to "play with".

The system is not unlikely to experience an "unscheduled power down" at any moment during encryption or decryption, and needs to be able to successfully resume the encryption/decryption process after being powered up again (and this seems like an extra-unpleasant nut to tackle).

The questions are:

1) is it at all doable :) ?

2) what would be the best strategy to go about

a) encrypting/decrypting with so little scratch space (can't have the entire file lying around while decrypting/encrypting, need to truncate it "on the fly" somehow...)

and

b) implementing a disaster recovery that would work in such a constrained environment?

P.S.: The cipher used has to be AES.

I looked into AES-CTR specifically but it does not seem to bode all that well for the disaster recovery shenanigan in an environment where you can't keep the entire decrypted file around till the end...

[edited to add] I think I'll be doing it the Iserni way after all.

هل كانت مفيدة؟

المحلول

It is doable, provided you have a means to save the AES status vector together with the file position.

  1. Save AES status and file position P to files STAGE1 and STAGE2
  2. Read one chunk (say, 10 megabytes) of encrypted/decrypted data
  3. Write the decrypted/encrypted chunk to external scratch SCRATCH
  4. Log the fact that SCRATCH is completed
  5. Write SCRATCH over the original file at the same position
  6. Log the fact that SCRATCH has been successfully copied
  7. Goto 1

If you get a hard crash after stage 1, and STAGE1 and STAGE2 disagree, you just restart and assume the stage with the earliest P to be good. If you get a hard crash during or after stage 2, you lose 10 megabytes worth of work: but the AES and P are good, so you just repeat stage 2. If you crash at stage 3, then on recovery you won't find the marker of stage 4, and so will know that SCRATCH is unreliable and must be regenerated. Having STAGE1/STAGE2, you are able to do so. If you crash at stage 4, you will BELIEVE that SCRATCH must be regenerated, even if you could avoid this -- but you lose nothing in regenerating except a little time. By the same token, if you crash during 5, or before 6 is committed to disk, you just repeat stages 5 and 6. You know you don't have to regenerate SCRATCH because stage 4 was committed to disk. If you crash after stage 1, you will still have a good SCRATCH to copy.

All this assumes that 10 MB is more than a cache's (OS + hard disk if writeback) worth of data. If it is not, raise to 32 or 64 MB. Recovery will be proportionately slower.

It might help to flush() and sync(), if these functions are available, after every write-stage has been completed.

Total write time is a bit more than twice normal, because of the need of "writing twice" in order to be sure.

نصائح أخرى

You have to work with the large file in chunks. Break a piece of the file off, encrypt it, and save it to disk; once saved, discard the unencrypted piece. Repeat. To decrypt, grab an encrypted piece, decrypt it, store the unencrypted chunk. Discard the encrypted piece. Repeat. When done decrypting the pieces, concatenate them.

Surely this is doable.

The "largest" (not large at all however) problem is that when you encrypt say 128 Mb of original data, you need to remove them from the source file. To do this you need to copy the remainder of the file to the beginning and then truncate the file. This would take time. During this step power can be turned off, but you don't care much -- you know the size of data you've encrypted (if you encrypt data by blocks with size multiple to 16 bytes, the size of the encrypted data will be equal to size that was or has to be removed from the decrypted file). Unfortunately it seems to be easier to invent the scheme than to explain it :), but I really see no problem other than extra copy operations which will slowdown the process. And no, there's no generic way to strip the data from the beginning of the file without copying the remainder to the beginning.

مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى StackOverflow
scroll top