Question

We have some programs in a very basic imperative language, the programs contain some basic features: assignment, condition, loop, scalar variables, 1-dimensional arrays and 2-dimensional arrays... Note that a variable or an array may not be initialized (or already hold a value before the execution of a program).

My question is... is there some existing work to identify all the dependents and precedents of a program by static analysis of the syntax, without executing the program? Or, is it very trivial to do that?

Without giving a formal definition of dependent and precedent, I illustrate these 2 notions by the following program:

a[2,2] = 0;
i = 1;
While i <= 10 do {
  a[2,2] = a[2,2] + a[i,1]; 
  i = i + 1
}

Just from the syntax of the program, we can say, the script precedes a[2,2], because the value of a[2,2] is totally determined by the execution of the program. Also, we can say, the script depends on a[1,1] to a[10, 1], because the values of a[1,1] to a[10, 1] determine the running of the program.

I guess roughly we can say, if a variable is on the right side of the first assignment containing it, the script depends on it; if a variable ever appears on the left side of an assignment, the script precedes it. A variable can be both precedent and dependent for a script at same time.

So, is it really that trivial? Which elements may complicate the issue? Is there some existing research?

Was it helpful?

Solution

The analysis you want is called reaching definitions in the compiler community.

For your case you would put a "fake" definition of each variable you are interested in at the beginning of the program (representing definitions that are entering your script from outside) and then determine whether that definition reaches any of the uses. If no fake definition reaches any use then the variable is defined inside the script, otherwise its value may be coming in from outside. The interesting cases involve conditionals and loops (some variables may be defined on one side of a conditional and not the other.)

I can highly recommend the following text books if you want to learn more:

  • Aho, Alfred V.; Sethi, Ravi; Ullman, Jeffrey D.: Compilers: Principles, Techniques, and Tools, 1986.
  • Cooper, Keith D.; Torczon, Linda: Engineering a Compiler, 2005.

Reaching definitions analyzes scalars (so you would analyze each a[i,j] independently, as you did in your example.)

You could try to analyze entire arrays simultaneously, but this is a research topic (because parts of the array can be defined and other parts not). The classic papers on trying to analyze entire arrays are

  • Zhiyuan Li: Array privatization for parallel execution of loops. Int'l Conf on Supercomputing, 1992.
  • Dror E. Maydan, Saman P. Amarasinghe, and Monica S. Lam: Array data-flow analysis and its use in array privatization. Symp Principles of Programming Languages, (POPL-20):2-15, 1993.
  • Peng Tu and David Padua. Automatic array privatization. Int'l Workshop on Languages and Compilers for Parallel Computing, (LCPC-6):500–521, 1993.
Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top