Question

This question is from Section 15.5 of Introduction to Algorithms (third edition).

We are given sequence of keys, $ k = \{ k_{1},k_{2},\dots,k_{n} \}$, where $k_{1}<k_{2} <\dots<k_{n} $.

For each key $k_{i}$, where $1\leq i \leq n$, we have a probability $p_{i}$ that a search will be for $k_{i}$.

We let the sequence $d =\{ d_{0},d_{1},\dots,d_{n} \}$, where $d_{0}<d_{1} <\dots<d_{n} $, be for values not in $k$.

For each key $d_{i}$, where $0\leq i \leq n$, we have a probability $q_{i}$ that a search will be for $d_{i}$.

The goal is to construct an optimal binary search tree.

Let us define $e[i,j]$ as the expected cost of searching an optimal binary search tree containing the keys $k_{i},\dots,k_{j}$.

Let $w(i,j) = \sum_{l=i}^{j} p_{l} + \sum_{l=i-1}^{j} q_{l}$.

The book gives the following equation as the recurrence formula for forming the optimal binary search tree: $$ e[i,j] = \begin{cases} q_{i-1} & \text{if } j = i-1, \\ \displaystyle\min_{i\le r\le j} \{e[i,r-1]+r[r+1],j]+w(i,j)\} & \text{if } i \leq j. \end{cases} $$

This formula makes sense for $i\leq j$, but I don't understand the case $j = i-1$.

Why is $e[i,i-1] = q_{i-1}$?

Was it helpful?

Solution

Quoting Introduction to Algorithms:

The easy case occurs when $j = i-1$. Then we have just the dummy key $d_{i-1}$. The expected search cost is $e[i,i-1] = q_{i-1}$.

In slightly more detail, $e[i,j]$ is supposed to be the cost of the optimal binary search tree for $k_i,\ldots,k_j$. It is important to understand what cost means here. The cost is with respect to the following distribution on the input:

  • With probability $p_i$, key $k_i$ is being searched.
  • With probability $q_i$, a dummy key $d_i$ between $k_i$ and $k_{i+1}$. (When $i = 0$, it is a key smaller than $k_1$, and when $i = n$, it is a key larger than $k_n$.)

Unfortunately, the textbook is being extremely sloppy, and never bothers to explain what they mean by "the cost of the optimal binary search tree for $k_i,\ldots,k_j$". It seems that they are measuring the expected cost with respect to the stated keys as well as with respect to the dummy keys $d_{i-1},\ldots,d_j$. That is, they want to minimize $$ q_{i-1} \mathit{cost}(d_{i-1}) + p_i \mathit{cost}(k_i) + q_{i+1} \mathit{cost}(d_{i+1}) + \cdots + q_{j-1} \mathit{cost}(d_{j-1}) + p_j \mathit{cost}(k_j) + q_j \mathit{cost}(d_j). $$ When $j = i-1$, there are no real keys, but there is a dummy key $d_{i-1} = d_j$, and so the cost is $q_{i-1} \mathit{cost}(d_{i-1})$. The optimal cost here is clearly $1$, and that's how they get $q_{i-1}$.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top