# Knowing the plaintext, how to discover the encryption scheme used? [closed]

###### https://stackoverflow.com/questions/231592

### Question

I have some char() fields in a DBF table that were left encrypted by a past developer in the project.

However, I know the plaintext result of the decryption of several records. How can I determine the function/algorithm/scheme to decrypt the original data? These are some sample fields:

For cryptext:

```
b5 01 02 c1 e3 0d 0a
```

plaintext should be:

```
3543921 or 3.543.921
```

And for cryptext:

```
41 c3 c5 07 17 0d 0a
```

plaintext should be

```
1851154 or 1.851.154
```

I believe `0d 0a` is just padding. The data was gathered in win-1252 encoding (I don't know if that matters).

**EDIT:** It's for the sake of curiosity and learning. I want to be able to understand the encryption used (it seems a simple one, although the data is binary) so I can recover the value of the fields for the tuples whose plaintext I don't know.

**EDIT 2:** Added a couple samples.

### Solution

There is no easy way in the general case. This question is too general. Try posting the plaintext and encrypted strings.

EDIT:

- For the sake of learning, you can read this article: Cryptography on Wikipedia
- If you really believe the encryption is simple, check whether it's a byte-level (or word-level) XOR; see the following pseudocode:

`for (i in originalString) { newString[i] = originalString[i] ^ CRYPT_BYTE; }`
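A runnable take on the pseudocode above, as a minimal Python sketch. The function names are made up for illustration, and the plaintext is assumed to be stored as ASCII digits, which the question does not confirm. With one known pair, a single-byte key (if there is one) falls out of the first byte and can then be verified against the rest.

```python
def xor_with_byte(data: bytes, key: int) -> bytes:
    """XOR every byte of data with the same one-byte key."""
    return bytes(b ^ key for b in data)


def recover_single_byte_key(ciphertext: bytes, plaintext: bytes):
    """Return the one-byte key mapping plaintext to ciphertext, or None."""
    if len(ciphertext) != len(plaintext) or not plaintext:
        return None
    # The first byte pair fixes the only possible key; check it on the rest.
    key = ciphertext[0] ^ plaintext[0]
    return key if xor_with_byte(plaintext, key) == ciphertext else None


# First sample from the question; ASCII digits are an assumption.
SAMPLE_CIPHERTEXT = bytes.fromhex("b50102c1e30d0a")
SAMPLE_PLAINTEXT = b"3543921"

print(recover_single_byte_key(SAMPLE_CIPHERTEXT, SAMPLE_PLAINTEXT))
```

For the first sample this prints `None`, so a single repeated key byte does not explain the data under that ASCII assumption.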

### OTHER TIPS

Assuming it's not something as simple as a substitution cipher (try frequency analysis) or a poorly applied XOR (e.g., reusing the key; try XORing two ciphertexts with known plaintexts and then see whether the result is the XOR of the plaintexts; or try XORing the ciphertext with itself shifted by some number of bytes), you should probably assume it's a well-known stream or block cipher with an unknown key (which most likely consists of ASCII characters). If you have a big enough sample of ciphertext-plaintext pairs, you could start by checking whether plaintexts with the same first few characters/bytes have ciphertexts with the same first characters/bytes. There you might also see whether it's a block or a stream cipher and whether there is any feedback mechanism involved. Padding, if present, might also suggest that it's a block cipher rather than a stream cipher.
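The key-reuse test described above can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the plaintexts are assumed to be the ASCII digit strings from the question. If both records were XORed with the same keystream, the keystream cancels out and the XOR of the ciphertexts equals the XOR of the plaintexts.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings position by position (stops at the shorter one)."""
    return bytes(x ^ y for x, y in zip(a, b))


def same_keystream(c1: bytes, p1: bytes, c2: bytes, p2: bytes) -> bool:
    """True if c1 ^ c2 == p1 ^ p2, i.e. both pairs fit one shared keystream."""
    n = min(len(c1), len(p1), len(c2), len(p2))
    return n > 0 and xor_bytes(c1[:n], c2[:n]) == xor_bytes(p1[:n], p2[:n])


# Both samples from the question; ASCII plaintext is an assumption.
C1, P1 = bytes.fromhex("b50102c1e30d0a"), b"3543921"
C2, P2 = bytes.fromhex("41c3c507170d0a"), b"1851154"

print(same_keystream(C1, P1, C2, P2))
```

For the two posted samples this prints `False`, so a plain XOR with one reused keystream does not fit them, at least not with the plaintext encoded as ASCII digits.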

Depending on how much effort you want to put into it, you should be able to get somewhere. Start by reading up on cryptanalysis, in particular the methods of cryptanalysis.

The things that will determine how easy this task will be are:

- how good the encryption method used is; if it's a recent, well-regarded method such as RSA or AES, you're probably out of luck
- how much ciphertext and plaintext you have -- the more the better
- what kind of data it is -- simple text is the easiest, while random data would be the hardest
- whether the data is all encrypted with the same key, or whether multiple keys have been used.

The key to success is not to be disheartened; the history of cryptanalysis is filled with stories of supposedly unbreakable codes being cracked. Perhaps the most famous is the Enigma machine from World War II, the cracking of which contributed to the development of modern computers.

We can tell a few things from what you've provided:

- With a ciphertext length of 7 bytes in each case, it's unlikely to be a block cipher (since block ciphers encrypt a block at a time, their length will be a multiple of the blocksize, and a blocksize of 56 bits is pretty unlikely*).
- The length of the ciphertext and the number of characters in the plaintext are the same in each case, so it could be a straightforward encoding of the numbers as ASCII with a stream cipher applied.
- XORing the plaintext (as ASCII) and the ciphertext together gives neither a single repeated octet nor the same keystream for each, so it's not a trivial cipher. It's also not a simple stream cipher using the same key for both, unless some of the ciphertext bytes are an IV.
- The last two bytes are identical in ciphertext but not in plaintext. This could be a coincidence but also could be indicative of padding as you suggest. If they are padding, some other encoding mechanism must be used.
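The structural observations in this list are easy to reproduce. The following Python sketch uses hypothetical helper names, and the block sizes tried (8 and 16 bytes) are just the common ones, picked for illustration.

```python
def common_suffix(a: bytes, b: bytes) -> bytes:
    """Return the longest run of trailing bytes shared by a and b."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return a[len(a) - n:]


def fits_block_size(ciphertext: bytes, block_size: int) -> bool:
    """True if the length is a non-zero whole number of blocks."""
    return len(ciphertext) > 0 and len(ciphertext) % block_size == 0


# The two ciphertexts from the question.
C1 = bytes.fromhex("b50102c1e30d0a")
C2 = bytes.fromhex("41c3c507170d0a")

for size in (8, 16):  # common block sizes, chosen for illustration
    print(size, fits_block_size(C1, size), fits_block_size(C2, size))

suffix = common_suffix(C1, C2)
print(suffix.hex())
# If the shared suffix is padding, this many bytes are left for 7 digits:
print(len(C1) - len(suffix))
```

Neither length fits an 8- or 16-byte block, the shared suffix is `0d0a`, and stripping it leaves 5 bytes for 7 digits, which is why one byte per character cannot be the whole story if those bytes are padding. Note that `0d 0a` is also the ASCII CR LF pair, so it may be a line ending rather than cipher padding.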

Do you know if all the encrypted values are integers, or are other values also possible?

Determining the algorithm used without the corresponding key may not be entirely useful.

If the text is small enough, and you have the plaintext, why would you want to figure it out? Other than, of course, for curiosity's sake?

There's no deterministic way to tell, but often there are hints in the ciphertext. Is it really encrypted (with some sort of key)? Or is it just hashed and (possibly) salted?

If it's hashed, you could get lucky and just google for a matching pair (assuming you have any that are dictionary words) because there are pre-hashed dictionaries already online.
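If you have a known plaintext, the unsalted-hash theory can also be checked locally. Below is a minimal Python sketch: the function name and the list of algorithms are arbitrary choices for illustration, and a salted or truncated hash would not be detected this way.

```python
import hashlib

# A small, arbitrary selection of algorithms to try.
CANDIDATE_ALGORITHMS = ("md5", "sha1", "sha256")


def matching_hashes(plaintext: bytes, stored: bytes) -> list:
    """Return the names of candidate algorithms whose digest equals stored."""
    matches = []
    for name in CANDIDATE_ALGORITHMS:
        digest = hashlib.new(name, plaintext).digest()
        if digest == stored:
            matches.append(name)
    return matches


# First sample from the question; ASCII plaintext is an assumption.
print(matching_hashes(b"3543921", bytes.fromhex("b50102c1e30d0a")))
```

The stored values in the question are only 7 bytes long, shorter than any of these digests (16, 20 and 32 bytes), so the result is an empty list and a plain hash of this kind can be ruled out for them.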

If you have an example of the ciphertext, you could post it, someone might recognize the cipher format...

I think it's a misconception that XOR is an easily decryptable scheme. The theoretically strongest form of encryption is a one-time pad: simply a string of truly random bits, as long as the message and never reused, which you XOR your plaintext with...
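As a rough illustration of the one-time pad idea, here is a short Python sketch. The function names are hypothetical, and `secrets.token_bytes` stands in for a source of truly random pad material; the scheme is only as strong as that randomness and the rule that a pad is never reused.

```python
import secrets


def otp_encrypt(plaintext: bytes):
    """Encrypt with a fresh random pad as long as the message."""
    pad = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, pad))
    return pad, ciphertext


def otp_decrypt(ciphertext: bytes, pad: bytes) -> bytes:
    """Decrypt by XORing with the same pad; lengths must match."""
    if len(pad) != len(ciphertext):
        raise ValueError("pad must be exactly as long as the ciphertext")
    return bytes(c ^ k for c, k in zip(ciphertext, pad))


pad, ciphertext = otp_encrypt(b"3543921")
print(otp_decrypt(ciphertext, pad))
```

The operation is the same byte-wise XOR as in the weak schemes; what changes is the key: as long as the message, random, and used once.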

Finite XORs, on the other hand...