Question

Suppose we're receiving numbers in a stream. After each number is received, a weighted sum of the last $N$ numbers needs to be calculated, where the weights are always the same, but arbitrary.

How efficiently can this be done if we are allowed to keep a data structure to help with the computation? Can we do any better than $\Theta(N)$, i.e., recomputing the sum each time a number is received?

For example: Suppose the weights are $W= \langle w_1, w_2, w_3, w_4\rangle$. At one point we have the list of last $N$ numbers $L_1= \langle a, b, c, d \rangle$, and the weighted sum $S_1=w_1*a+w_2*b+w_3*c+w_4*d$.

When another number, $e$, is received, we update the list to get $L_2= \langle b,c,d,e\rangle$ and we need to compute $S_2=w_1*b+w_2*c+w_3*d+w_4*e$.
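As a concrete baseline, here is a minimal Python sketch of the naive approach (function and variable names are my own): keep the last $N$ numbers in a buffer and recompute the sum from scratch on every arrival, which is exactly the $\Theta(N)$-per-item cost in question.

```python
from collections import deque

def stream_weighted_sums(stream, weights):
    """Naive baseline: recompute the full weighted sum for every
    incoming number (Theta(N) work per item, N = len(weights))."""
    n = len(weights)
    window = deque(maxlen=n)  # holds the last n numbers, oldest first
    for x in stream:
        window.append(x)
        if len(window) == n:
            # weights[0] pairs with the oldest number, matching
            # S_1 = w_1*a + w_2*b + w_3*c + w_4*d in the example above.
            yield sum(w * v for w, v in zip(weights, window))
```

With $W=\langle w_1,\dots,w_4\rangle$ and inputs $a, b, c, d, e$ this yields $S_1$ and then $S_2$ as in the example.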

Consideration using FFT

A special case of this problem appears to be solvable efficiently by employing the Fast Fourier Transform. Here, we compute the weighted sums $S$ in multiples of $N$. In other words, we receive $N$ numbers and only then can we compute the corresponding $N$ weighted sums. To do this, we need the $N-1$ past numbers (for which sums have already been computed) and the $N$ new numbers, $2N-1$ numbers in total.

If this vector of input numbers and the weight vector $W$ define the coefficients of the polynomials $P(x)$ and $Q(x)$ respectively, with the coefficients of $Q$ reversed, then the product $P(x)\times Q(x)$ is a polynomial whose coefficients of $x^{N-1}$ through $x^{2N-2}$ are exactly the weighted sums we seek. These can be computed using the FFT in $\Theta(N\log N)$ time, which gives an average of $\Theta(\log N)$ time per input number.
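Here is a minimal numpy sketch of this block step (the function name is my own; the FFT size can be anything at least the product length $3N-2$): one FFT-based polynomial product yields all $N$ weighted sums.

```python
import numpy as np

def block_weighted_sums(history, weights):
    """Compute N weighted sums at once.

    history : the N-1 old inputs followed by the N new ones (2N-1 total),
              i.e. the coefficients of P(x).
    weights : w_1..w_N; reversed, they are the coefficients of Q(x).

    The coefficients of x^{N-1}..x^{2N-2} in P(x)*Q(x) are the N sums.
    """
    n = len(weights)
    assert len(history) == 2 * n - 1
    size = 1 << (3 * n - 2).bit_length()       # FFT size >= product length
    p = np.fft.rfft(history, size)
    q = np.fft.rfft(np.asarray(weights)[::-1], size)
    product = np.fft.irfft(p * q, size)        # coefficients of P(x)*Q(x)
    return product[n - 1 : 2 * n - 1]

# Quick check against the direct definition:
rng = np.random.default_rng(0)
w, a = rng.standard_normal(4), rng.standard_normal(7)  # N = 4, 2N-1 = 7
assert np.allclose(
    block_weighted_sums(a, w),
    [sum(w[i] * a[k + i] for i in range(4)) for k in range(4)],
)
```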

This is, however, not a solution to the problem as stated, because the weighted sum must be computed each time a new number is received; we cannot delay the computation.


Solution

Here is an elaboration of your approach. Every $m$ iterations, we use the FFT algorithm to compute $m$ values of the convolution in time $O(n\log n)$, assuming that the subsequent $m$ values are zero. In other words, we are computing $$ \sum_{i=0}^{n-1} w_i a_{t-i+k}, \quad 0 \leq k \leq m-1, $$ where $w_i$ are the $n$ weights (or the reverse weights), $a_i$ is the input sequence, $t$ is the current time, and $a_{t'} = 0$ for $t' > t$.
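As a sketch of this batch step (numpy; names are my own, and `recent` holds the last $n$ inputs $a_{t-n+1},\dots,a_t$ in chronological order), one FFT-based product computes all $m$ partial sums under the convention $a_{t'}=0$ for $t'>t$:

```python
import numpy as np

def batch_partial_sums(recent, weights, m):
    """Compute the m partial sums  sum_i w_i * a_{t-i+k},  k = 0..m-1,
    treating inputs newer than a_t as zero.  recent = a_{t-n+1}..a_t.
    One FFT-based convolution: O(n log n) for the whole batch."""
    n = len(weights)
    size = 1 << (2 * n - 1).bit_length()       # FFT size >= product length
    prod = np.fft.irfft(
        np.fft.rfft(recent, size) * np.fft.rfft(weights, size), size
    )
    # Coefficient n-1+k of the product is the k-th partial sum; the
    # missing (future) inputs contribute nothing thanks to zero padding.
    return prod[n - 1 : n - 1 + m]
```

The per-iteration corrections described next then complete each value as the missing inputs arrive.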

For each of the following $m$ iterations, we can then complete the required convolution value in time $O(m)$: the $i$th iteration needs time $O(i)$ to add the contributions of the $i$ inputs that were still unknown (assumed zero) when the batch was computed. So the amortized time per input is $O(m) + O(n\log n/m)$. This is minimized by choosing $m = \sqrt{n\log n}$, which gives an amortized running time of $O(\sqrt{n\log n})$.
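For completeness, here is the balancing step behind this choice of $m$, with constant factors suppressed:
$$
f(m) = m + \frac{n\log n}{m}, \qquad
f'(m) = 1 - \frac{n\log n}{m^{2}} = 0
\;\Longrightarrow\; m = \sqrt{n\log n},
$$
at which point both terms equal $\sqrt{n\log n}$, so $f(m) = O(\sqrt{n\log n})$.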

We can improve this to a worst-case running time of $O(\sqrt{n\log n})$ by breaking the computation into parts. Fix $m$, and define $$ b_{T,p,o} = \sum_{i=0}^{m-1} w_{pm+i}\, a_{Tm-i+o}, \qquad C_{T,p} = \left( b_{T,p,0}, \ldots, b_{T,p,m-1} \right). $$ Each $C_{T,p}$ depends on only $2m$ inputs, so it can be computed in time $O(m\log m)$. Also, given $C_{\lfloor t/m \rfloor-p,p}$ for $0 \leq p \leq n/m-1$, we can compute the convolution in time $O(n/m + m)$ (the $O(m)$ term covers the block $p = 0$, whose current entry depends on inputs from the current period and is computed directly).

The plan therefore is to maintain the list $$ C_{\lfloor t/m \rfloor-p,p}, \quad 0 \leq p \leq n/m-1. $$ For each period of $m$ inputs, we need to update $n/m$ of these. Each update takes time $O(m\log m)$, so if we spread the updates evenly, each input incurs work $O((n/m^2)\, m\log m) = O((n/m) \log m)$. Together with computing the convolution itself, the time complexity per input is $O((n/m)\log m + m)$. Choosing $m = \sqrt{n\log n}$ as before, this gives $O(\sqrt{n\log n})$.
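The decomposition behind this can be sanity-checked directly: with $T = \lfloor t/m \rfloor$ and $o = t \bmod m$, writing $j = pm + i$ splits the convolution $\sum_{j=0}^{n-1} w_j a_{t-j}$ into $\sum_p b_{T-p,\,p,\,o}$. A small Python check (helper names are my own; the direct loops stand in for the $O(m\log m)$ FFT convolutions the real data structure would use):

```python
import numpy as np

def c_block(a, w, m, T, p):
    """C_{T,p} from the text: the m values b_{T,p,o}, o = 0..m-1, where
    b_{T,p,o} = sum_{i<m} w_{p*m+i} * a_{T*m-i+o}.  It depends on 2m-1
    consecutive inputs, so one length-O(m) FFT convolution would fill it."""
    return [sum(w[p * m + i] * a[T * m - i + o] for i in range(m))
            for o in range(m)]

rng = np.random.default_rng(1)
n, m = 8, 2
w, a = rng.standard_normal(n), rng.standard_normal(64)
for t in range(n + m, len(a) - m):
    T, o = divmod(t, m)
    via_blocks = sum(c_block(a, w, m, T - p, p)[o] for p in range(n // m))
    direct = sum(w[j] * a[t - j] for j in range(n))
    assert np.isclose(via_blocks, direct)
```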

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange