Question

I developed an application in C++ using Crypto++ to encrypt information and store the file on the hard drive. I use an integrity string to check whether the password entered by the user is correct. Can you please tell me whether the implementation generates a secure file? I am new to the world of cryptography and I wrote this program based on what I have read.

string integrity = "ImGood";
string plaintext = integrity + string("some text");
byte password[pswd.length()]; // The password is filled somewhere else
byte salt[SALT_SIZE]; // SALT_SIZE is 32
byte key[CryptoPP::AES::MAX_KEYLENGTH];
byte iv[CryptoPP::AES::BLOCKSIZE];

CryptoPP::AutoSeededRandomPool rnd;
rnd.GenerateBlock(iv, CryptoPP::AES::BLOCKSIZE);
rnd.GenerateBlock(salt, SALT_SIZE);

CryptoPP::PKCS5_PBKDF2_HMAC<CryptoPP::SHA512> gen;
gen.DeriveKey(key, CryptoPP::AES::MAX_KEYLENGTH, 32,
              password, pswd.length(),
              salt, SALT_SIZE,
              256);

CryptoPP::StringSink* sink = new CryptoPP::StringSink(cipher);
CryptoPP::Base64Encoder* base64_enc = new CryptoPP::Base64Encoder(sink);
CryptoPP::CFB_Mode<CryptoPP::AES>::Encryption cfb_encryption(key, CryptoPP::AES::MAX_KEYLENGTH, iv);
CryptoPP::StreamTransformationFilter* aes_enc = new CryptoPP::StreamTransformationFilter(cfb_encryption, base64_enc);
CryptoPP::StringSource source(plaintext, true, aes_enc);

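std::ostringstream out; // requires <sstream>
// Write the raw bytes with explicit lengths: operator<< would treat the
// byte arrays as NUL-terminated C strings.
out.write(reinterpret_cast<const char*>(iv), sizeof(iv));
out.write(reinterpret_cast<const char*>(salt), sizeof(salt));
out << cipher;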

The information in the string stream "out" is then written to a file. Another thing: I don't know what the "purpose" parameter of the derivation function means. I am guessing it is the desired length of the key, so I put 32, but I'm not sure and I can't find anything about it in the Crypto++ manual.

Any opinion, suggestion or mistake pointed out is appreciated.

Thank you very much in advance.

Solution

A file can be "secure" only if you define what you mean by "secure".

Usually, you will be interested in two properties:

  • Confidentiality: the data that is encrypted shall remain unreadable to attackers; revealing the plaintext data requires knowledge of a specific secret.

  • Integrity: any alteration of the data should be reliably detected; attackers shall not be able to modify the data in any way (even "blindly") without the modification being noticed by whoever decrypts the data.

Your piece of code, apparently, fulfils confidentiality (to some extent) but not integrity. Your string called "integrity" is a misnomer: it is not an integrity check. Its role is apparently to detect accidental password mistakes, not attacks; thus, it would be less confusing if that string were called passwordVerifier instead. An attacker can alter any bit beyond the first 48 bits (the six bytes of "ImGood") without the decryption process noticing anything.
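The undetected alteration described above follows from the structure of CFB itself, so it can be illustrated without a real cipher. The sketch below is an illustration only: `toyBlock` is a hypothetical stand-in for the AES block function (it is not secure), and `cfbCrypt` and `tamperDemo` are names invented here. It flips one bit of the last ciphertext byte and decrypts the result.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

using Block = std::array<std::uint8_t, 16>;

// Hypothetical placeholder for the AES block function. NOT secure; it only
// gives CFB something keyed to call so the mode's structure can be shown.
Block toyBlock(const Block& key, const Block& in) {
    Block out{};
    std::uint8_t acc = 0x5A;
    for (std::size_t i = 0; i < out.size(); ++i) {
        acc = static_cast<std::uint8_t>(acc * 31 + (in[i] ^ key[i]) + i);
        out[i] = acc;
    }
    return out;
}

// Full-block CFB: C_i = P_i XOR E(C_{i-1}), with C_0 = IV.
// Decryption computes the same keystream from the received ciphertext.
std::string cfbCrypt(const Block& key, const Block& iv,
                     const std::string& in, bool decrypt) {
    std::string out(in.size(), '\0');
    Block feedback = iv;
    for (std::size_t off = 0; off < in.size(); off += 16) {
        const Block ks = toyBlock(key, feedback);
        const std::size_t n = std::min<std::size_t>(16, in.size() - off);
        for (std::size_t i = 0; i < n; ++i) {
            out[off + i] = static_cast<char>(in[off + i] ^ ks[i]);
        }
        // The feedback register is always loaded with ciphertext.
        const std::string& ct = decrypt ? in : out;
        for (std::size_t i = 0; i < n; ++i) {
            feedback[i] = static_cast<std::uint8_t>(ct[off + i]);
        }
    }
    return out;
}

// Encrypts verifier + message, flips one bit of the last ciphertext byte,
// and returns what the decrypting side would see.
std::string tamperDemo() {
    Block key{};
    key.fill(0x11);  // fixed demo values, not real key material
    Block iv{};
    iv.fill(0x22);
    const std::string plaintext = std::string("ImGood") + "pay 100 to alice";
    std::string ciphertext = cfbCrypt(key, iv, plaintext, false);
    ciphertext.back() = static_cast<char>(ciphertext.back() ^ 0x01);
    return cfbCrypt(key, iv, ciphertext, true);
}
```

The string returned by `tamperDemo()` still starts with "ImGood", so a check on that prefix passes, while the final character of the message has changed. Nothing in the decryption path signals the modification.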

Adding integrity (the genuine thing) requires the use of a MAC. Combining encryption and a MAC securely is subject to subtleties; therefore, it is recommended to use an authenticated encryption (AE) mode, which performs both encryption and MAC, and does so securely (i.e. that specific combination was explicitly reviewed by hordes of cryptographers). Commonly recommended AE modes include GCM and EAX.

An important point to note is that, in a context where integrity matters, data cannot be processed before having been verified. This has implications for big files: if your huge file is adorned with a single MAC (whether "manually" or as part of an AE mode), then you must first verify the complete file before beginning to do anything with the plaintext data. This does not work well with streamed processing (e.g. if playing a huge video). A workaround is to split the data into individual chunks, each with its own MAC, but then some care must be taken about the ordering of chunks (attackers could try to remove, duplicate or reorder chunks): things become complex. Complexity, on a general basis, is bad for security.
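One common way to handle the chunk-ordering problem is to bind each chunk's position into the data covered by its MAC, for example as the associated data of an AE mode (GCM and EAX both accept associated data). The sketch below only builds that per-chunk header. The 9-byte layout and the name `chunkAssociatedData` are assumptions made for illustration, not an established format.

```cpp
#include <array>
#include <cstdint>

// Hypothetical per-chunk header: 8-byte big-endian chunk index followed by a
// one-byte "last chunk" flag. Feeding it to the MAC (or passing it as
// associated data to the AE mode) ties each chunk to its position.
std::array<std::uint8_t, 9> chunkAssociatedData(std::uint64_t index, bool isLast) {
    std::array<std::uint8_t, 9> ad{};
    for (int i = 0; i < 8; ++i) {
        ad[i] = static_cast<std::uint8_t>(index >> (8 * (7 - i)));
    }
    ad[8] = isLast ? 1 : 0;
    return ad;
}
```

The decrypting side would recompute the header from the position at which it reads each chunk. A reordered or duplicated chunk then fails verification because its index does not match, and a truncated file is noticed because no chunk carrying the last-chunk flag is seen. Each chunk would also need its own nonce under the same key.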

There are contexts where integrity does not matter. For instance, if your attack model is "the attacker steals the laptop", then you only have to care about confidentiality. However, if the attack model is "the attacker steals the laptop, modifies a few files, and puts it back in my suitcase without me noticing", then integrity matters: the attacker could plant a modification in the file, and infer parts of the secret itself based on your external behaviour when you next access the file.

For confidentiality, you use CFB, which is a bit old-style, but not wrong. For the password-to-key transform, you use PBKDF2, which is fine; the iteration count, though, is quite low: you use 256. Typical values are 20000 or more. In principle, you should measure actual performance and set this count to as high a value as you can tolerate: a higher value means slower processing, both for you and for the attacker, so you ought to crank that up (depending on your patience).
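The measurement step can be sketched as a small calibration loop. This is a minimal sketch, not part of Crypto++: `calibrateIterations`, its parameters, and the doubling strategy are all assumptions made here, and `runKdf` stands for whatever callable performs one key derivation with the given count.

```cpp
#include <chrono>
#include <cstdint>
#include <functional>

// Doubles the iteration count until one run of the KDF takes at least
// `target`, or until `cap` is reached. `runKdf` is a caller-supplied callable
// that performs one key derivation with the given count.
std::uint32_t calibrateIterations(const std::function<void(std::uint32_t)>& runKdf,
                                  std::chrono::milliseconds target,
                                  std::uint32_t start = 20000,
                                  std::uint32_t cap = 1u << 30) {
    using Clock = std::chrono::steady_clock;
    std::uint32_t n = start;
    for (;;) {
        const auto t0 = Clock::now();
        runKdf(n);
        const auto elapsed = Clock::now() - t0;
        if (elapsed >= target || n >= cap) {
            return n;
        }
        // Double the count, clamping at the cap to avoid overflow.
        n = (n > cap / 2) ? cap : n * 2;
    }
}
```

With Crypto++, the callable would wrap a `DeriveKey` call on a throwaway password and salt. Because the count is doubled at each step, the result can overshoot the target time by up to a factor of two. Whatever count is chosen has to be stored next to the salt, since decryption must repeat the derivation with the same value.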


Mandatory warning: you are in the process of defining your own crypto, which is a path fraught with perils. Most people who do that produce weak systems, and that includes trained cryptographers; in fact, being a trained cryptographer does not mean that you know how to define a secure protocol, but rather that you know better than to define your own protocol. You are thus highly encouraged to rely on an existing protocol (or format), rather than making your own. I suggest OpenPGP, with (for instance) GnuPG as a support library. Even if you need for some reason (e.g. licence issues) to reimplement the format, using a standard format is still a good idea: it avoids introducing weaknesses, it promotes interoperability, and it can be tested against existing systems.

Licensed under: CC-BY-SA with attribution