Pros/Cons of Static and Dynamic Instrumentation

https://stackoverflow.com/questions/2038625

19-09-2019

Question

There are many static and dynamic instrumentation tools. Soot is a static instrumentation tool for Java bytecode. Pin and Valgrind are dynamic instrumentation tools for binaries.

What are the pros and cons of static and dynamic instrumentation tools? I think static instrumentation tools are better in terms of runtime performance, whereas dynamic tools are more powerful. Please compare them in terms of capability and performance.

Also, how does using an instrumentation tool differ from writing an LLVM pass?

Solution

I'm assuming the need is to discover code that takes significant time and that you could optimize to save that time. That is a different goal from just timing routines.

I'm skeptical of static analyzers, because where the time goes depends entirely on the input data mix.

Dynamic instrumentation tries to measure properties of functions such as self time and total time (absolute, average, and percent), call counts, and each routine's role in the call graph.

Dynamic instrumentation (a la gprof) has been the de-facto standard for decades, but it is very far from being the last word. For one thing, it is important to realize that most of the statistics it gives you are missing the point in terms of your original need.

These days (IMHO) you need a sampling profiler that samples the call stack, not just the program counter. It should sample on wall-clock time, not just CPU time. Samples need not be drawn at high frequency. It should suppress sampling when the app is waiting for user input. It should give you information at the line or instruction level, not just the function level. The most important statistic it should give you for a line of code is the percentage of samples containing it, because that is the most direct measure of the time that can be saved if that line is optimized.
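To make the "percentage of samples containing a line" statistic concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not how OProfile or Zoom work internally. The names `sample_stacks` and `inclusive_percentages` are made up for this example, and it relies on `sys._current_frames()`, which is a CPython-specific facility.

```python
import collections
import sys
import threading


def inclusive_percentages(samples):
    """Map each (filename, lineno) to the percentage of samples whose stack contains it."""
    counts = collections.Counter()
    for stack in samples:
        # A line counts once per sample, even if recursion puts it on the stack twice.
        for location in set(stack):
            counts[location] += 1
    total = len(samples)
    if total == 0:
        return {}
    return {loc: 100.0 * n / total for loc, n in counts.items()}


def sample_stacks(target, interval=0.01):
    """Run target() in the calling thread while a helper thread samples its call stack."""
    thread_id = threading.get_ident()
    samples = []
    done = threading.Event()

    def sampler():
        # Event.wait() times out on wall-clock time, so blocked time is sampled too.
        while not done.wait(interval):
            frame = sys._current_frames().get(thread_id)
            stack = []
            while frame is not None:  # walk from the innermost frame outward
                stack.append((frame.f_code.co_filename, frame.f_lineno))
                frame = frame.f_back
            if stack:
                samples.append(stack)

    helper = threading.Thread(target=sampler, daemon=True)
    helper.start()
    try:
        target()
    finally:
        done.set()
        helper.join()
    return samples
```

Calling `inclusive_percentages(sample_stacks(some_function))` gives, for every source line, the share of samples in which that line was somewhere on the stack. A line that appears in 40% of samples is responsible for roughly 40% of the wall-clock time, whether it is doing work itself or waiting on something it called.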

A few profilers can do this, OProfile and RotateRight/Zoom in particular.

OTHER TIPS

The main advantage of static instrumentation is that the analysis does not depend on the input. It works on the original code and covers every path through it, so coverage is complete. This kind of instrumentation usually rewrites the binary ahead of time, so the result is ready to execute without a separate process at run time. That also means the code runs fast, with the only overhead coming from the injected code. The drawback is that the analysis is less detailed because run-time information is missing, which sometimes makes it very difficult to achieve your goals.
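As a rough analogy in Python (source rewriting rather than binary rewriting), the sketch below inserts a counter at the top of every function before the code is ever run. The class `EntryCounter`, the helper `instrument`, the `_hits` dictionary, and the sample `SOURCE` are all invented for this illustration.

```python
import ast

SOURCE = '''
def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"

def unused():
    return None
'''


class EntryCounter(ast.NodeTransformer):
    """Insert a hit-counter statement at the top of every function definition."""

    def __init__(self):
        self.instrumented = []

    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        # Simplification: the probe goes before any docstring; SOURCE has none.
        probe = ast.parse(
            f"_hits[{node.name!r}] = _hits.get({node.name!r}, 0) + 1"
        ).body[0]
        node.body.insert(0, probe)
        self.instrumented.append(node.name)
        return node


def instrument(source):
    """Rewrite source ahead of time, then load it; returns (namespace, probed names)."""
    rewriter = EntryCounter()
    tree = rewriter.visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    namespace = {"_hits": {}}
    exec(compile(tree, "<instrumented>", "exec"), namespace)
    return namespace, rewriter.instrumented
```

The rewrite reaches every function, including `unused`, without running anything, which is the "full coverage" property. After loading, the only overhead is the injected statement itself. What the rewriter cannot know in advance is which functions a particular run will actually call; that information only shows up in `_hits` after execution.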

Dynamic instrumentation, on the other hand, has access to every detail available while the code is running. In most cases, tools that perform dynamic instrumentation are easy to write. However, it cannot achieve full code coverage, because the execution path depends on the inputs given. The need for an external process to attach to and instrument the original one also makes things slower.
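The input dependence can be seen in a small Python sketch that uses the standard `sys.setprofile` hook as a stand-in for a dynamic instrumentation tool. The function names (`trace_calls`, `dispatch`, `handle_negative`, `handle_other`) are hypothetical and exist only for this example.

```python
import collections
import sys


def trace_calls(func, *args):
    """Run func(*args) and count the Python-level calls observed at run time."""
    calls = collections.Counter()

    def profiler(frame, event, arg):
        # "call" fires for Python functions; C functions report "c_call" instead.
        if event == "call":
            calls[frame.f_code.co_name] += 1

    sys.setprofile(profiler)
    try:
        func(*args)
    finally:
        sys.setprofile(None)
    return calls


def handle_negative(n):
    return -n


def handle_other(n):
    return n


def dispatch(n):
    return handle_negative(n) if n < 0 else handle_other(n)
```

Running `trace_calls(dispatch, 5)` records `dispatch` and `handle_other` but never sees `handle_negative`; only a negative input would exercise that path. The hook also runs on every call event, which is a small-scale version of the slowdown described above.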

As far as I can tell, LLVM passes are used for static instrumentation: the instrumentation code is generated at compile time and is already part of the final binary, so it shares all the pros and cons of static instrumentation techniques.

To conclude, it's a matter of what you need. You should choose the right tool for your job.

Licensed under: CC-BY-SA with attribution
