Question

I have some base-64 encoded encrypted data and noticed a fair amount of repetition. In a string of approximately 200 characters, a certain base-64 character is repeated up to 7 times in several separate repeated runs.

Is this a red flag that there is a problem in the encryption? According to my understanding, encrypted data should never show significant repetition, even if the plaintext is entirely uniform (i.e. even if I encrypt 2 GB of nothing but the letter A, there should be no significant repetition in the encrypted version).

Solution

According to the binomial distribution, there is about a 2.5% chance that a given character from a set of 64 appears exactly seven times in a series of 200 random characters. That's a small chance, but not negligible. With more information, you might raise your confidence from 97.5% to something very close to 100% … or find that the cipher text really is uniformly distributed.
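
The 2.5% figure can be checked with a few lines of Python. This is a minimal sketch that assumes each of the 200 characters is drawn independently and uniformly from the 64 base-64 symbols; the helper name `binom_pmf` is made up for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 200, 1 / 64  # 200 characters, 64 equally likely symbols (assumption)

# Chance that one particular character appears exactly 7 times.
p_exactly_7 = binom_pmf(7, n, p)

# Chance that one particular character appears 7 or more times.
p_at_least_7 = 1 - sum(binom_pmf(k, n, p) for k in range(7))

# Expected number of the 64 symbols that reach 7 or more occurrences
# (linearity of expectation, so no independence between symbols is needed).
expected_symbols = 64 * p_at_least_7

print(f"exactly 7:  {p_exactly_7:.1%}")
print(f"7 or more:  {p_at_least_7:.1%}")
print(f"symbols expected to reach 7+: {expected_symbols:.1f}")
```

Under these assumptions the first value comes out at roughly 2.5%, and the last line suggests that seeing some character seven times in a 200-character string is not unusual in itself.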

You say that the "character is repeated up to 7 times" in several separate repeated runs. That's not enough information to say whether the cipher text has a bias. Instead, tell us the total number of times the character appeared, and the total number of cipher text characters. For example, "it appeared a total of 3125 times in 1000 runs of 200 characters each."
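
Once the totals are known, a simple frequency check becomes possible. The sketch below is one way to do it, not a definitive test: `z_score` and `chi_square_uniform` are hypothetical helper names, and both assume that unbiased cipher text yields independent, uniformly distributed base-64 symbols.

```python
from collections import Counter
from math import sqrt

B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def z_score(observed, total, p=1 / 64):
    """How many standard deviations one symbol's count is from its expectation."""
    return (observed - total * p) / sqrt(total * p * (1 - p))

def chi_square_uniform(text):
    """Pearson chi-square statistic of text against a uniform base-64 alphabet."""
    # Ignore '=' padding, line breaks, and anything else outside the alphabet.
    symbols = [c for c in text if c in B64_ALPHABET]
    if not symbols:
        raise ValueError("no base-64 symbols found")
    counts = Counter(symbols)
    expected = len(symbols) / len(B64_ALPHABET)
    return sum((counts[c] - expected) ** 2 / expected for c in B64_ALPHABET)

# The example above: 3125 occurrences in 1000 * 200 characters is exactly 1/64.
print(z_score(3125, 1000 * 200))
```

With 63 degrees of freedom, a chi-square statistic far above roughly 82.5 (the approximate 5% critical value) would hint at a bias. A single 200-character string gives only about 3 expected occurrences per symbol, which is too few for the test to be reliable, so pool as much cipher text as possible.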

Also, you need to be sure that you are talking about the raw output of a cipher. Cipher text is often encapsulated in an "envelope" like that defined by the Cryptographic Message Syntax. Of course, this enclosing structure will have predictable patterns.
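
To illustrate how an envelope produces repetition even when the payload is random, here is a small sketch. The 12-byte `HEADER` is entirely hypothetical and far simpler than a real Cryptographic Message Syntax structure, and `os.urandom` merely stands in for cipher output.

```python
import base64
import os

# Hypothetical fixed header: version, tag, length. Not a real CMS structure.
HEADER = b"\x00\x00\x00\x01" + b"DEMO" + b"\x00\x00\x00\x20"

def wrap(payload):
    """Prepend the fixed header and base-64 encode the result."""
    return base64.b64encode(HEADER + payload).decode("ascii")

a = wrap(os.urandom(32))  # random bytes stand in for cipher text
b = wrap(os.urandom(32))

# 12 header bytes map to exactly 16 base-64 characters, so the prefix
# is identical for every message, and zero bytes encode to runs of 'A'.
print(a[:16])
print(a[:16] == b[:16])
```

Every message wrapped this way starts with the same 16 characters, several of them 'A', regardless of how good the cipher is.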

OTHER TIPS

It depends. Repetition is generally a bad sign if it corresponds to repeated data in the plaintext.

Since you are encoding the data yourself, have you looked at the plaintext to see whether something in it repeats with those same counts?

To understand this better, you need to know what kind of encryption is in use. The repetition could be just a coincidence.

But if the repetition comes from repeated data, it can be a red flag, because frequency counts can then be used to decode it.

What kind of encryption are you using? Home made or some industry standard?

It depends on how you are encrypting your data.

Base64 encoding a string may count as light obfuscation, but it is NOT encryption. The purpose of Base64 encoding is to allow any sort of binary data to be encoded as a safe ASCII string.
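
A quick sketch of why Base64 offers no secrecy: decoding needs no key at all. The sample message is made up for illustration.

```python
import base64

message = b"attack at dawn"  # made-up sample plaintext

# Encoding only changes the representation to safe ASCII characters.
encoded = base64.b64encode(message).decode("ascii")

# Anyone can reverse it; no key or secret is involved.
decoded = base64.b64decode(encoded)

print(encoded)
print(decoded == message)
```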

Licensed under: CC-BY-SA with attribution