Question

I had this weird idea for an encryption scheme that I wanted to try out. It may be bad, and it may have been done before, but I'm just doing it for fun. The short version of the question is: is it possible to generate a long, deterministic, non-uniformly distributed string or sequence of numbers from a small seed?

Long(er) version: I was thinking of encrypting a text by changing its encoding. The new encoding would be generated by the Huffman algorithm. To work well, the Huffman algorithm needs a fairly long text with a non-uniform character distribution. Characters can then have different bit lengths, which would be the primary strength of this encryption. The problem is that it's impractical to enter or remember a long text each time you want to decrypt. So I was wondering whether it is possible to generate such a text from a password used as a seed.
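To make the idea concrete, here is a minimal sketch of deriving a variable-length code table from a key text with the Huffman algorithm. The function name `huffman_codes` and the sample key text are assumptions made for illustration; they do not come from the question.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return a dict mapping each character of `text` to a bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {char: code_so_far}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {ch: ""}) for i, (ch, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

# Hypothetical key text; in the scheme above it would be generated from the password.
key_text = "this is a sample key text with a skewed letter distribution"
codes = huffman_codes(key_text)
encoded = "".join(codes[ch] for ch in "test")
print(codes)
print(encoded)
```

Only characters that occur in the key text receive a code, so the key text has to cover every character of the message.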

It doesn't matter what the text is, as long as it has a non-uniform distribution of characters and the exact same sequence can be recreated each time you give it the same seed. Preferably, are there any functions/extensions in Python that can do this?

EDIT: To expand on the "strength" of varying bit lengths: take the string "test", with ASCII values 116, 101, 115, 116, which gives the 7-bit values 1110100 1100101 1110011 1110100.

Then, say my Huffman algorithm generates an encoding like t = 101, e = 1100111, s = 10001.

The final bit string is 101 1100111 10001 101. If we regroup it into 7-bit ASCII (padding the last group with zeros), we get 1011100 1111000 1101000, which is three entirely different characters: "\xh". Obviously it's impossible to perform any kind of frequency analysis or anything like that on this.
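The regrouping in this example can be checked with a few lines of Python. This is only a sketch of the arithmetic above; the code table is copied from the example, and padding the last group with trailing zeros is an assumption that reproduces the stated result.

```python
# Code table taken from the example above.
codes = {"t": "101", "e": "1100111", "s": "10001"}

# Concatenate the variable-length codes for "test".
bits = "".join(codes[ch] for ch in "test")

# Pad with trailing zeros to a multiple of 7 bits (assumed padding rule).
padded = bits + "0" * (-len(bits) % 7)

# Regroup into 7-bit chunks and read each chunk as an ASCII code.
chunks = [padded[i:i + 7] for i in range(0, len(padded), 7)]
result = "".join(chr(int(chunk, 2)) for chunk in chunks)

print(chunks)        # ['1011100', '1111000', '1101000']
print(repr(result))  # '\\xh'
```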


Solution 2

Based on DhruvPathak's straightforward answer, which creates a simple random string of characters, I have two additions: ① a non-uniform distribution and ② a random translation to prevent the letter frequencies from being predicted:

import random

translation = list(range(26))  # list() so that shuffle also works in Python 3
random.shuffle(translation)  # ②
random_string = ''.join(chr(
  translation[random.randint(0, random.randint(1, 25))] + ord('a'))  # ①
  for _dummy in range(1000))

The non-uniform distribution is achieved by nesting the calls as randint(0, randint(1, 25)), which basically prefers the lower numbers as output.
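How strongly the nested call favours low indices can be computed exactly. The sketch below enumerates both draws with exact fractions; the helper name `index_probabilities` is made up for this illustration.

```python
from fractions import Fraction

def index_probabilities(upper=25):
    """Exact probability of each k from random.randint(0, random.randint(1, upper))."""
    probs = [Fraction(0)] * (upper + 1)
    for n in range(1, upper + 1):      # inner randint(1, upper): each n has probability 1/upper
        for k in range(0, n + 1):      # outer randint(0, n): each k has probability 1/(n + 1)
            probs[k] += Fraction(1, upper) * Fraction(1, n + 1)
    return probs

probs = index_probabilities()
for k in (0, 1, 12, 25):
    print(k, float(probs[k]))
```

Indices 0 and 1 come out equally likely, because the inner call never returns 0, and the probability then drops steadily down to 1/650 for index 25.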

In a first try I got this translation list:

[5, 18, 22, 16, 3, 20, 2, 4, 19, 24, 9, 21, 12, 15, 7, 0, 25, 11, 14, 17, 10, 8, 13, 6, 1, 23]

A count of the characters in the resulting random_string, sorted from most to least frequent (computed with f = [0] * 26; for c in random_string: f[ord(c) - ord('a')] += 1; list(zip(*reversed(sorted(zip(f, range(26))))))[1]), gave this list:

(18, 5, 22, 16, 3, 20, 2, 4, 19, 24, 12, 21, 15, 9, 0, 7, 25, 14, 17, 10, 11, 13, 8, 1, 23, 6)

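So the outcome matches the expectation pretty well: the letters ordered by frequency follow the translation list closely, with only neighbouring entries swapped.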

Other tips

This is a solution based on the random module, which generates the same sequence whenever it is given the same seed.

import random
from string import ascii_lowercase
from collections import Counter

seed_value = 3334
string_length = 50
random.seed(seed_value)  # same seed, same sequence
# Give every lowercase letter a random integer weight between 1 and 10.
seq = [(x, random.randint(1, 10)) for x in ascii_lowercase]
# Build a pool in which each letter appears wt times, then pick one entry.
weighted_choice = lambda s: random.choice(sum(([v] * wt for v, wt in s), []))
random_list = [weighted_choice(seq) for x in range(string_length)]
print("".join(random_list))
print("Test non uniform distribution...")
print(Counter(random_list))
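Since the question asks for a password rather than a number as the seed, one possible variation is sketched below. Hashing the password with SHA-256 to obtain an integer seed, the function name `key_text`, and the use of `random.choices` for the weighted draw are all assumptions made for illustration; they are not part of the answer above.

```python
import hashlib
import random
from string import ascii_lowercase

def key_text(password, length=1000):
    """Return a reproducible, non-uniformly distributed string for `password`."""
    # Turn the password into an integer seed (assumed derivation, for illustration only).
    digest = hashlib.sha256(password.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    # Random per-letter weights, as in the answer above.
    weights = [rng.randint(1, 10) for _ in ascii_lowercase]
    # Weighted sampling with replacement from a private generator instance.
    return "".join(rng.choices(ascii_lowercase, weights=weights, k=length))

print(key_text("hunter2", length=50))
```

A private `random.Random` instance keeps the result independent of other code that uses the global generator. The sequence is reproducible for the same password on the same Python version. Note that the `random` module is not designed for cryptographic use.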
Licensed under: CC-BY-SA with attribution