Question

It is possible to decompile .pyc files: Decompile Python 2.7 .pyc

Is it possible to `compile` python files so there is a human-unreadable code, like the c++ -> exe binary file? ..unlike the plaintext .py and very easily recoverable .pyc files? (I don't mind if it can be cracked by brute force)

Was it helpful?

Solution

Python is a highly dynamic language, and supports many different levels of introspection. Because of that, obfuscating Python bytecode is a mountainous task.

Moreover, your embedded python interpreter will still need to be able to execute the bytecode you ship with your product. And if the interpreter needs to be able to access the bytecode, then everyone else can too. Encryption won't help, because you still need to decrypt the bytecode yourself and then everyone else can read the bytecode from memory. Obfuscation only makes default tools harder, not impossible to use.

With that said, here is what you'd have to do to make it really bloody hard to read your application's Python bytecode:

  • Re-assign all python opcode values a new value. Rewire the whole interpreter to use different byte values for different opcodes.

  • Remove all as many introspection features as you can get away with. Your functions need to have closures, and codeobjects need constants still, but to hell with the locals list in the code object, for example. Neuter the sys._getframe() function, slash traceback information.

Both these steps require in-depth knowledge of how the Python interpreter works, and how the Python object model fits together. You will most certainly introduce bugs that will be hard to solve.

In the end, you have to ask yourself if this is worth it. A determined hacker can still analyze your bytecode, do a some frequency analysis to reconstruct the opcode table, and / or feed your program different opcodes to see what happens, and decipher all the obfuscation. Once a translation table is created, decompiling your bytecode is a snap, and reconstructing your code is not far away.

If all you want to do is prevent bytecode files from being altered, embed checksums for your .pyc files, and check those on startup. Refuse to load if they don't match. Someone will patch your binary to remove the checksum check or replace the checksums, but you won't have to put in nearly as much effort to provide at least some token protection from tampering.

OTHER TIPS

Every system with code encrypting can be attacked as the decrypting key must be somewhere present.

As such, it is just a question of effort.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top