Question

I have a method in a class which decrypts a variable, and returns it. I remove the returned variable with "del" after use.

What is the danger of these garbage values being accessed...and how can I best protect myself from them?

Here is the code:

import decrypter
import gc

# mangled variable names used
def decrypt(__var):
    __cleartext = decrypter.removeencryption(__var)
    return __cleartext

__p_var = "<512 encrypted password text>"
__p_cleartext = decrypt(__p_var)
<....do login with __p_cleartext...>
del __p_var, __p_cleartext
gc.collect()

Could any of the variables, including __var and __cleartext, be exploited at this point?

Thanks!


I've done a little more googling. Before I spend a few hours going down the wrong path...what I'm hearing is:

  1. Store the password as a salted hash on the system (which it is doing now).
  2. The salt for the hash should be entered in by the user at suite start (being done now)
  3. However, the salt should be held in a C process and not in Python.
  4. The Python script should pass the hash to the C process for decryption.

The python script is handling the login for a mysql database, and the password is needed to open the DB connection.

If the code were along the lines of...

# MySQLdb.connect(host, user, password, database)
mysql_host = 'localhost'
mysql_db = 'myFunDatabase'
hashed_user = '\xghjd\xhjiw\xhjiw\x783\xjkgd6\xcdw8'
hashed_password = 'ghjkde\xhu78\x8y9tyk\x89g\x5de56x\xhyu8'
db = MySQLdb.connect(mysql_host, <call_c(hashed_user)>, <call_c(hashed_password)>, mysql_db)

Would this resolve (at least) the issue of python leaving garbage all over?


P.s. I also found the post about memset (Mark data as sensitive in python) but I'm assuming if I use C to decrypt the hash, this is not helpful.

P.P.S. The decrypter is currently a Python script. If I were to add memset to the script and then "compile" it using py2exe or pyinstaller...would this actually do anything to help protect the password? My instincts say no, since all pyinstaller does is package up the normal interpreter and the same bytecode the local interpreter creates...but I don't know enough about it...?


So...following Aya's suggestion of making the encryption module in C, how much of a discernible memory footprint would the following setup leave? A big part of the issue is that the ability to decrypt the password must remain available throughout the run of the program, as it will be called repeatedly...it's not a one-time thing.

Make a C object which is started when the user logs in. It contains the decryption routine and holds a copy of the salt entered by the user at login. The stored salt is obscured in the running object (in memory) by having been encrypted by its own encryption routine using a randomly generated salt.

The randomly generated salt would still have to be held in a variable in the object too. This is not really to secure the salt, but just to try and obfuscate the memory footprint if someone should take a peek at it (making the salt hard to identify). I.e. c-obj

mlock() /*to keep the code memory resident (no swap)*/

char encrypt(data, salt){ 
    (...) 
    return encrypted_data
}

char decrypt(data, salt){ 
    (...) 
    return decrypted_data
}

stream_callback(stream_data){
    return decrypt(stream_data, decrypt(s-gdhen, jhgtdyuwj))
}

void main{ 
    char jhgtdyuwj=rand();
    s-gdhen = encrypt(<raw_user_input>, jhgtdyuwj);
}

Then, the python script calls the C object directly, which passes the unencrypted result right into the MySQLdb call without storing any returns in any variable. I.e.

#!/usr/bin/python
encrypted_username = 'feh9876\xhu378\x&457(oy\x'
encrypted_password = 'dee\x\xhuie\xhjfirihy\x^\xhjfkekl'
# MySQLdb.connect(host, username, password, database)
db = MySQLdb.connect(self.mysql_host,
                     c-obj.stream_callback(encrypted_username),
                     c-obj.stream_callback(encrypted_password),
                     self.mysql_database)

What kind of memory footprint might this leave which could be snooped?


Solution 3

Any security system is only as strong as its weakest link.

It's difficult to tell what the weakest link is in your current system, since you haven't really given any details on the overall architecture, but if you're actually using Python code like you posted in the question (let's call this myscript.py)...

#!/usr/bin/python
encrypted_username = 'feh9876\xhu378\x&457(oy\x'
encrypted_password = 'dee\x\xhuie\xhjfirihy\x^\xhjfkekl'
# MySQLdb.connect(host, username, password, database)
db = MySQLdb.connect(self.mysql_host,
                     c-obj.stream_callback(encrypted_username),
                     c-obj.stream_callback(encrypted_password),
                     self.mysql_database)

...then regardless of how or where you decrypt the password, any user can come along and run a script like this...

import MySQLdb

def my_connect(*args, **kwargs):
    print args, kwargs
    return MySQLdb.real_connect(*args, **kwargs)

MySQLdb.real_connect = MySQLdb.connect
MySQLdb.connect = my_connect
execfile('/path/to/myscript.py')

...which will print out the plaintext password, so implementing the decryption in C is like putting ten deadbolts on the front door, but leaving the window wide open.

If you want a good answer on how to secure your system, you'll have to provide some more information on the overall architecture, and what attack vectors you're trying to prevent.

If someone manages to hack root, you're pretty much screwed, but there are better ways to conceal the password from non-root users.

However, if you're satisfied that the machine you're running this code on is secure (in the sense that it can't be accessed by any 'unauthorized' users), then none of this password obfuscation stuff is necessary - you may as well just put the cleartext password directly into the Python source code.


Update

Regarding architecture, I meant, how many separate servers are you running, what responsibilities do they have, and how are they meant to communicate with each other, and/or the outside world?

Assuming the primary goal is to prevent unauthorized access to the MySQL server, and assuming MySQL runs on a different server to the Python script, then why are you more concerned about someone gaining access to the server running the Python script, and getting the password for the MySQL server, rather than gaining access to the MySQL server directly?

If you're using a 'salt' as a decryption key for the encrypted MySQL password, then how does an authorized user pass that value to the system? Do they have to log in to the server via, say, ssh, and run the script from the command line, or is this something accessible via, say, a webserver?

Either way, if someone does compromise the system running the Python script, they merely have to wait until the next authorized user comes along, and 'sniff' the 'salt' they enter.

Other tips

Even if you call gc.collect and those strings are deallocated, they might still remain in memory. Also, strings are immutable, which means you have no (standard) way of overwriting them. Also note that if you have performed operations on those strings some copies of them might be lying around.

So don't use strings if possible.

You need to overwrite the memory (and even then, the memory might be dumped somewhere, like into a page file). Use a byte-array and overwrite the memory when you're done.
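
As a minimal sketch of that idea (Python 3 syntax; the `wipe` helper and the sample secret are made up for illustration and are not part of the original code), the secret is held in a bytearray and zeroed in place once it is no longer needed:

```python
def wipe(buf):
    """Overwrite a mutable buffer in place with zero bytes."""
    for i in range(len(buf)):
        buf[i] = 0


# Hypothetical secret, held in a bytearray instead of an immutable str.
secret = bytearray(b"hunter2")
try:
    pass  # ... use `secret` here, e.g. hand it to the login call ...
finally:
    wipe(secret)  # runs even if the code in the try block raises
```

This only clears this one buffer. Any copy made along the way (for example, if a library converts the value to a str) is not touched, and the caveat about the page file still applies.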

If no other references to the value exist, then your gc.collect normally destroys the object.

However, something as simple as string interning or caching may keep an unexpected reference, leaving the value alive in memory. Python has a number of implementations (CPython, Jython, PyPy) that do different things internally. The language itself makes very few guarantees about whether or when the value would actually get erased from memory.

In your example, you also use name mangling. Because the mangling is easily reproduced by hand, this doesn't add any security at all.
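
To illustrate (a sketch in Python 3; the `Vault` class and its contents are hypothetical), a double-underscore attribute defined inside a class is simply stored under a predictable name that anyone can spell out. Note also that mangling only happens inside a class body, so module-level names such as `__p_cleartext` are not mangled at all:

```python
class Vault:
    def __init__(self, secret):
        # Inside the class body this name is rewritten to _Vault__secret.
        self.__secret = secret


v = Vault("hunter2")

# Reproducing the mangled name by hand gives direct access to the value.
recovered = v._Vault__secret

# The instance dictionary shows the mangled name in plain sight.
attribute_names = sorted(vars(v))
```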

One further thought: It isn't clear what your security model is. If the attacker can call your decrypt function and run arbitrary code in the same process, what would prevent them from wrapping decrypt to keep a copy of the inputs and outputs?
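
A rough sketch of such a wrapper, in Python 3: the `decrypt` below is a stand-in that just reverses its input, and `spy` and `captured` are names invented for this example.

```python
import functools

captured = []  # everything the attacker manages to record


def decrypt(ciphertext):
    # Stand-in for the real routine (assumption): just reverses the text.
    return ciphertext[::-1]


def spy(func):
    """Wrap func so every call's arguments and result are recorded."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        captured.append((args, kwargs, result))
        return result
    return wrapper


# Code running in the same process can rebind the name at any time.
decrypt = spy(decrypt)

cleartext = decrypt("2retnuh")  # the caller sees nothing unusual
```

The calling code still gets the correct result, so nothing looks wrong from its side, while `captured` now holds both the encrypted input and the cleartext output.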

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow