Question

So lately I have been toying around with how Mathematica's pattern matching and term rewriting might be put to good use in compiler optimizations...trying to highly optimize short blocks of code that are the inner parts of loops. Two common ways to reduce the amount of work it takes to evaluate an expression is to identify sub-expressions that occur more than once and store the result and then use the stored result at subsequent points to save work. Another approach is to use cheaper operations where possible. For instance, my understanding is that taking square roots take more clock cycles than additions and multiplications. To be clear, I am interested in the cost in terms of floating point operations that evaluating the expression would take, not how long it takes Mathematica to evaluate it.

My first thought was that I would tackle the problem developing using Mathematica's simplify function. It is possible to specify a complexity function that compares the relative simplicity of two expressions. I was going to create one using weights for the relevant arithmetic operations and add to this the LeafCount for the expression to account for the assignment operations that are required. That addresses the reduction in strength side, but it is the elimination of common subexpressions that has me tripped up.

I was thinking of adding common subexpression elimination to the possible transformation functions that simplify uses. But for a large expression there could be many possible subexpressions that could be replaced and it won't be possible to know what they are till you see the expression. I have written a function that gives the possible substitutions, but it seems like the transformation function you specify needs to just return a single possible transformation, at least from the examples in the documentation. Any thoughts on how one might get around this limitation? Does anyone have a better idea of how simplify uses transformation functions that might hint at a direction forward?

I imagine that behind the scenes that Simplify is doing some dynamic programming trying different simplifications on different parts of the expressions and returning the one with the lowest complexity score. Would I be better off trying to do this dynamic programming on my own using common algebraic simplifications such as factor and collect?

EDIT: I added the code that generates possible sub-expressions to remove

(*traverses entire expression tree storing each node*)
AllSubExpressions[x_, accum_] := Module[{result, i, len},
  len = Length[x];
  result = Append[accum, x];
  If[LeafCount[x] > 1,
   For[i = 1, i <= len, i++,
     result = ToSubExpressions2[x[[i]], result];
     ];
   ];
  Return[Sort[result, LeafCount[#1] > LeafCount[#2] &]]
  ]

CommonSubExpressions[statements_] := Module[{common, subexpressions},
subexpressions = AllSubExpressions[statements, {}];
  (*get the unique set of sub expressions*)
  common = DeleteDuplicates[subexpressions];
  (*remove constants from the list*)
  common = Select[common, LeafCount[#] > 1 &];
  (*only keep subexpressions that occur more than once*)
  common = Select[common, Count[subexpressions, #] > 1 &];
  (*output the list of possible subexpressions to replace with the \
number of occurrences*)
  Return[common];
  ]

Once a common sub-expression is chosen from the list returned by CommonSubExpressions the function that does the replacement is below.

eliminateCSE[statements_, expr_] := Module[{temp},
temp = Unique["r"];
Prepend[ReplaceAll[statements, expr -> temp], temp[expr]]
]

At the risk of this question getting long, I will put a little example code up. I thought a decent expression to try to optimize would be the classical Runge-Kutta method for solving differential equations.

Input:
nextY=statements[y + 1/6 h (f[t, n] + 2 f[0.5 h + t, y + 0.5 h f[t, n]] + 
    2 f[0.5 h + t, y + 0.5 h f[0.5 h + t, y + 0.5 h f[t, n]]] + 
    f[h + t, 
     y + h f[0.5 h + t, y + 0.5 h f[0.5 h + t, y + 0.5 h f[t, n]]]])];
possibleTransformations=CommonSubExpressions[nextY]
transformed=eliminateCSE[nextY, First[possibleTransformations]]

Output:
{f[0.5 h + t, y + 0.5 h f[0.5 h + t, y + 0.5 h f[t, n]]], 
y + 0.5 h f[0.5 h + t, y + 0.5 h f[t, n]], 
0.5 h f[0.5 h + t, y + 0.5 h f[t, n]], 
f[0.5 h + t, y + 0.5 h f[t, n]], y + 0.5 h f[t, n], 0.5 h f[t, n], 
0.5 h + t, f[t, n], 0.5 h}

statements[r1[f[0.5 h + t, y + 0.5 h f[0.5 h + t, y + 0.5 h f[t, n]]]], 
y + 1/6 h (2 r1 + f[t, n] + 2 f[0.5 h + t, y + 0.5 h f[t, n]] + 
 f[h + t, h r1 + y])]

Finally, the code to judge the relative cost of different expressions is below. The weights are conceptual at this point as that is still an area I am researching.

Input:
cost[e_] := 
 Total[MapThread[
   Count[e, #1, Infinity, Heads -> True]*#2 &, {{Plus, Times, Sqrt, 
     f}, {1, 2, 5, 10}}]]
cost[transformed]

Output:
100
Was it helpful?

Solution

To identify repeating subexpressions, you could use something like this

(*helper functions to add Dictionary-like functionality*)

index[downvalue_, 
   dict_] := (downvalue[[1]] /. HoldPattern[dict[x_]] -> x) // 
   ReleaseHold;
value[downvalue_] := downvalue[[-1]];
indices[dict_] := 
  Map[#[[1]] /. {HoldPattern[dict[x_]] -> x} &, DownValues[dict]] // 
   ReleaseHold;
values[dict_] := Map[#[[-1]] &, DownValues[dict]];
items[dict_] := Map[{index[#, dict], value[#]} &, DownValues[dict]];
indexQ[dict_, index_] := 
  If[MatchQ[dict[index], HoldPattern[dict[index]]], False, True];


(*count number of times each sub-expressions occurs *)
expr = Cos[x + Cos[Cos[x] + Sin[x]]] + Cos[Cos[x] + Sin[x]];
Map[(counts[#] = If[indexQ[counts, #], counts[#] + 1, 1]; #) &, expr, 
  Infinity];
items[counts] // Column

OTHER TIPS

There are also some routines here implemented here by this author: http://stoney.sb.org/wordpress/2009/06/converting-symbolic-mathematica-expressions-to-c-code/

I packaged it into a *.M file and have fixed a bug (if the expression has no repeated subexpressions the it dies), and I am trying to find the author's contact info to see if I can upload his modified code to pastebin or wherever.

EDIT: I have received permission from the author to upload it and have pasted it here: http://pastebin.com/fjYiR0B3

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top