Question

Is it possible to recover source code from an HHVM authoritative repo file? I'd like to use HHVM for basic tamper-proofing when doing local installs of my commercial SaaS product.

I imagine (haven't checked) the sqlite3 db contains bytecode and, given PHP's dynamic nature, variable names. Since *.pyc files can be reversed in a fairly straightforward way, should I assume the same is possible here, even if no tools are currently available?

Solution

Yes, it is possible to disassemble HHVM's bytecode repository and reconstruct something close to the original source. While HHVM does not provide any tools for this at present, HipHop bytecode (HHBC) is pretty close to the original source and contains rich metadata that includes local variable names, function names, etc. In that respect, HHBC bears some similarity to Java's bytecode or .NET's IL.
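To make the point about metadata concrete, here is a minimal sketch using the Python case the question mentions rather than HHVM itself (HHBC's format and tooling are different, so treat this purely as an analogy). The function `apply_discount` is a made-up example; the code object attributes and the `dis` module are part of the standard library.

```python
import dis


def apply_discount(price, rate):
    # Hypothetical function, used only to have something to compile.
    discounted = price * (1 - rate)
    return round(discounted, 2)


# The compiled code object is what ends up in a .pyc file.
code = apply_discount.__code__

print(code.co_name)      # the function's own name
print(code.co_varnames)  # argument and local variable names
print(code.co_names)     # global names the body refers to

# Human-readable listing of the bytecode instructions.
dis.dis(apply_discount)
```

All of the identifiers survive compilation, which is why decompilers for this kind of bytecode can produce output that reads much like the original source.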

It might be possible to strip some of this metadata, but a lot of it is needed at run time to handle dynamic constructs like "$f(..)", "call_user_func(..)", "class_exists(..)", and "$$x", not to mention the reflection APIs (ReflectionClass, ReflectionFunction, etc.).

You might want to try one of the many PHP->PHP obfuscators out there (disclaimer: I haven't tried any of these obfuscators). Some of the better PHP->PHP obfuscators attempt to detect whether your code uses a function name or class name in a "dynamic" way and avoid renaming those classes or functions, but I would imagine there are corner cases where these heuristics fail and some amount of manual tuning is required.
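The kind of corner case I have in mind can be sketched as follows. This is an illustration in Python, not PHP, and it does not model any particular obfuscator; `compute_price`, `call_by_name`, and the two dictionaries are hypothetical stand-ins for a symbol table before and after a naive rename.

```python
def compute_price(quantity):
    # Hypothetical business function that an obfuscator might rename.
    return quantity * 3


def call_by_name(namespace, name, *args):
    # Dynamic dispatch: the target is chosen by a string at run time,
    # much like "$f(..)" or "call_user_func(..)" in PHP.
    return namespace[name](*args)


# Symbol table as written by the author.
original = {"compute_price": compute_price}

# Symbol table after a naive renamer replaced the name with "a"
# but left the string used for the dynamic call untouched.
renamed = {"a": compute_price}

result = call_by_name(original, "compute_price", 4)

try:
    call_by_name(renamed, "compute_price", 4)
    lookup_failed = False
except KeyError:
    # The string still refers to the old name, so the call breaks.
    lookup_failed = True
```

An obfuscator has to either spot every string that might end up naming a function or class, or leave those names alone, and that is hard to get right automatically when the string is built at run time.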

Also, depending on your situation, it might be possible to use file system permissions to solve your problem (i.e. prevent regular users on your server from being able to access the bytecode repository), though it sounds like this might be outside of your control for your use case.

Licensed under: CC-BY-SA with attribution