Transform a parse tree of a polynomial to a parse tree of its evaluation according to Horner's scheme

https://stackoverflow.com/questions/10052896

29-05-2021
|

Question

Could you please point me to an algorithm that takes a (binary) parse tree for evaluating a polynomial expression in a single variable and returns an equivalent parse tree that evaluates the polynomial according to Horner's rule.

The intended use case is in expression templates. The idea is that for a matrix x the parse tree obtained from

a + bx + c * x*x + d * x*x*x...

will get optimized into the corresponding parse tree of

a + x *( b + x( c + x*d))

Solution

You can get the monomial coefficients of the tree by a recursive function. Converting these coefficients and getting an expression according to Horner's law would be simple.

I can give you a simple recursive function that computes these values, even though a more efficient one may exist.

Theoretical stuff

First, let's formulate expressions. An expression E:

E = a0 + a1x + a2x^2 + ... + anx^n

can be written as an (n+1)-ary tuple:

(a0, a1, a2, ..., an)

Then, we define two operations:

Addition: Given two expressions E1 = (a0, ..., an) and E2 = (b0, ..., bm), the corresponding tuple of E1 + E2 is:

          {(a0+b0, a1+b1, ..., am+bm, a(m+1), ..., an) (n > m)
E1 + E2 = {(a0+b0, a1+b1, ..., an+bn, b(n+1), ..., bm) (n < m)
          {(a0+b0, a1+b1, ..., an+bn)                  (n = m)

that is, there are max(n,m)+1 elements, and the i^th element is computed by (with a C-ish syntax):

i<=n?ai:0 + i<=m?bi:0

Multiplication: Given two expressions E1 = (a0, ..., an) and E2 = (b0, ..., bm), the corresponding tuple of E1 * E2 is:
```
E1 * E2 = (a0*b0, a0*b1+a1*b0, a0*b2+a1*b1+a2*b0, ... , an*bm)
```
that is, there are n+m+1 elements, and the i^th element is computed by
```
sigma over {ar*bs | 0<=r<=n, 0<=s<=m, r+s=i}
```

The recursive function is therefore defined as follows:

tuple get_monomial_coef(node)
    if node == constant
        return (node.value)  // Note that this is a tuple
    if node == variable
        return (0, 1)        // the expression is E = x
    left_expr = get_monomial_coef(node.left)
    right_expr = get_monomial_coef(node.right)
    if node == +
        return add(left_expr, right_expr)
    if node == *
        return mul(left_expr, right_expr)

where

tuple add(tuple one, tuple other)
    n = one.size
    m = other.size
    for i = 0 to max(n, m)
        result[i] = i<=n?one[i]:0 + i<=m?other[i]:0
    return result

tuple mul(tuple one, tuple other)
    n = one.size
    m = other.size
    for i = 0 to n+m
        result[i] = 0
        for j=max(0,i-m) to min(i,n)
            result[i] += one[j]*other[i-j]
    return result

Note: in the implementation of mul, j should iterate from 0 to i, while the following conditions must also hold:

j <= n (because of one[j])
i-j <= m (because of other[i-j]) ==> j >= i-m

Therefore, j can go from max(0,i-m) and min(i,n) (which is equal to n since n <= i)

C++ implementation

Now that you have the pseudo-code, the implementation shouldn't be hard. For the tuple type, a simple std::vector would suffice. Therefore:

vector<double> add(const vector<double> &one, const vector<double> &other)
{
    size_t n = one.size() - 1;
    size_t m = other.size() - 1;
    vector<double> result((n>m?n:m) + 1);
    for (size_t i = 0, size = result.size(); i < size; ++i)
        result[i] = (i<=n?one[i]:0) + (i<=m?other[i]:0);
    return result;
}

vector<double> mul(const vector<double> &one, const vector<double> &other)
{
    size_t n = one.size() - 1;
    size_t m = other.size() - 1;
    vector<double> result(n + m + 1);
    for (size_t i = 0, size = n + m + 1; i < size; ++i)
    {
        result[i] = 0.0;
        for (size_t j = i>m?i-m:0; j <= n; ++j)
            result[i] += one[j]*other[i-j];
    }
    return result;
}

vector<double> get_monomial_coef(const Node &node)
{
    vector<double> result;
    if (node.type == CONSTANT)
    {
        result.push_back(node.value);
        return result;
    }
    if (node.type == VARIABLE)
    {
        result.push_back(0.0);
        result.push_back(1);  // be careful about floating point errors
                              // you might want to choose a better type than
                              // double for example a class that distinguishes
                              // between constants and variable coefficients
                              // and implement + and * for it
        return result;
    }
    vector<double> left_expr = get_monomial_coef(node.left);
    vector<double> right_expr = get_monomial_coef(node.right);
    if (node.type == PLUS)
        return add(left_expr, right_expr);
    if (node.type == MULTIPLY)
        return mul(left_expr, right_expr);
    // unknown node.type
}

vector<double> get_monomial_coef(const Tree &tree)
{
    return get_monomial_coef(tree.root);
}

Note: this code is untested. It may contain bugs or insufficient error checking. Make sure you understand it and implement it yourself, rather than copy-paste.

From here on, you just need to build an expression tree from the values this function gives you.

OTHER TIPS

You could use the following tranformation.

Assumption: the parse tree of the polynomial is in the order of increasing exponents -- if this assumption does not hold, the partial polynomes can be swapped around in the parse tree to make the assumption hold

Assumption: the parse tree holds exponential forms of the variable (e.g. x^2) instead of multiplicational forms (e.g. x*x), except for x^0 -- simple transformations can convert between the two in either direction

Assumption: the coefficients (if constant) in the polynom are non-zero -- this is to avoid having to deal with (a+0*x+c*x^2 -> a+x(cx) instead of a+cx^2)

Parse tree for a+b*x^1+c*x^2+d*x^3:

  .+..
 /    \
a   ...+....
   /        \
  *         .+..
 / \       /    \
b  ^      *      *
  / \    / \    / \
 x   1  c   ^  d   ^
           / \    / \
          x   2  x   3

Transformation to a+x(b+c*x^1+d*x^2)

  +
 / \
a   *
   / \
  x   +
     / \
    b   .+..
       /    \
      *      *
     / \    / \
    c   ^  d   ^
       / \    / \
      x   1  x   2

Transformation to a+x(b+x(c+d*x^1))

Then finally (a+x(b+x(c+d*x))):

The common transformation is:

 .            ->    .
  .           ->     .
   .          ->      .
    +         ->      .*..
   / \        ->     /    \
  *   N(k+1)  ->    ^      +
 / \          ->   / \    / \
ck  ^         ->  x   k  ck  N'(k+1)
   / \        -> 
  x   k       ->

where N'(k+1) is the same subtree as N(k+1) with each exponent j replaced with j-k (if k is 1, replace the x^k subtree with x)

The algorithm is then applied again(*) on N'(k+1) until its root is * (instead of +), indicating that the final partial polynom is reached. If the final exponent is 1, replace the exponential part with x (x^1 -> x)

(*) here is the recursion part

Note: if you keep track of the exponent changes in the N(k+1) subtrees cumulatively, you can put the last two steps together to transform N(k+1) recursively in one go

Note: If you want to allow negative exponents, then either

a) extract the highest negative exponent as the first term:

a*x^-2 + b*x^-1 + c + d*x + e*x^2  ->  x^-2(a+b*x+c*x^2+d*x^3+d*x^4)

and apply the above transformation

or b) separate the positive and negative exponential parts of the equation and appply the above transformation on each separately (you will need to "flip" the operands for the negative-exponent side and replace multiplication with division):

a*x^-2 + b*x^-1 + c + d*x + e*x^2  ->  [a+x^-2 + b*x-1] + [c + d*x + e*x^2] ->
-> [(a/x + b)/x] + [c+x(d+ex)]

this approach seems more complicated than a)

You just have to apply the following rules until you can't apply them anymore.

((x*A)*B)     -> (x*(A*B))
((x*A)+(x*B)) -> (x*(A+B)))
(A+(n+B))     -> (n+(A+B))  if n is a number

where A and B are subtrees.

Here is an OCaml code to do this:

type poly = X | Num of int | Add of poly * poly | Mul of poly * poly

let rec horner = function
  | X -> X
  | Num n -> Num n
  | Add (X, X) -> Mul (X, Num 2)
  | Mul (X, p)
  | Mul (p, X) -> Mul (X, horner p)
  | Mul (p1, Mul (X, p2))
  | Mul (Mul (X, p1), p2) -> Mul (X, horner (Mul (p1, p2)))
  | Mul (p1, p2) -> Mul (horner p1, horner p2) (* This case should never be used *)
  | Add (Mul (X, p1), Mul (X, p2)) -> Mul (X, horner (Add (p1, p2)))
  | Add (X, Mul (X, p))
  | Add (Mul (X, p), X) -> Mul (X, Add (Num 1, horner p))
  | Add (Num n, p)
  | Add (p, Num n) -> Add (Num n, horner p)
  | Add (p1, Add (Num n, p2))
  | Add (Add (Num n, p1), p2) -> Add (Num n, horner (Add (p1, p2)))
  | Add (p1, p2) -> horner (Add (horner p1, horner p2))

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow