Question

I'm looking for some standard terminology, metrics and/or applications that take into account the memory density and access sequentiality of algorithms.

When we measure algorithms we tend to give big-O bounds such as $O(n)$, and usually we are measuring time complexity. Somewhat less frequently, though still often, we also measure the space complexity of an algorithm.

Given current computing systems, however, the density of memory and the sequence in which it is accessed play a major role in the practical performance of an algorithm. Indeed, there are scenarios where an $O(\log n)$ algorithm with dispersed, random memory access can be slower than an $O(n)$ algorithm with dense, sequential memory access. I've not seen these aspects covered in formal theory before; surely such theory must exist and I'm just ignorant of it.
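
To make the effect concrete, here is a minimal C sketch (the array size, the shuffle, and the timing method are my own illustrative choices, not a rigorous benchmark): it sums the same array once in sequential order and once in a random order, so both loops do $\Theta(n)$ work, yet on typical hardware the random-order pass is several times slower because almost every access misses the cache.

```c
/* Minimal sketch: summing the same N integers sequentially vs. in a random
 * order. Both loops do Theta(N) work, yet the random-order pass typically
 * runs several times slower because almost every access misses the cache.
 * The size and the timing method are illustrative choices only. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1 << 24)  /* 16M ints, comfortably larger than typical caches */

int main(void) {
    int *data = malloc(N * sizeof *data);
    size_t *perm = malloc(N * sizeof *perm);
    if (!data || !perm) return 1;

    for (size_t i = 0; i < N; i++) { data[i] = (int)i; perm[i] = i; }

    /* Crude Fisher-Yates shuffle to obtain a random access order. */
    srand(42);
    for (size_t i = N - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1);
        size_t t = perm[i]; perm[i] = perm[j]; perm[j] = t;
    }

    long long sum = 0;
    clock_t t0 = clock();
    for (size_t i = 0; i < N; i++) sum += data[i];        /* dense, sequential */
    clock_t t1 = clock();
    for (size_t i = 0; i < N; i++) sum += data[perm[i]];  /* dispersed, random */
    clock_t t2 = clock();

    printf("sum=%lld sequential=%.3fs random=%.3fs\n", sum,
           (double)(t1 - t0) / CLOCKS_PER_SEC,
           (double)(t2 - t1) / CLOCKS_PER_SEC);
    free(data);
    free(perm);
    return 0;
}
```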

What are the standard metrics, terms, and approaches to this space density and access sequentiality?

Solution

There are a few models that take into account the number of memory transfers an algorithm makes. One of the most successful is the ideal-cache model introduced by Frigo et al., Cache-Oblivious Algorithms, 1999. It is based on the external memory model introduced by Aggarwal and Vitter, The Input/Output Complexity of Sorting and Related Problems, 1988. The notes by Demaine give a nice, readable overview of the model and of some of the algorithms and data structures out there. Alternatively, check out the chapter on cache-oblivious data structures in the Handbook of Data Structures and Applications. If you prefer videos, Demaine covers all the basics and perhaps even more in the MIT Introduction to Algorithms lectures.

The model assumes no knowledge of the number of levels in the memory hierarchy. In addition, there are no parameters such as cache-line length or cache size to configure. One useful property is that if an algorithm performs optimally in the ideal-cache model, it performs optimally on any $k$-level memory hierarchy. Algorithms designed and analyzed this way, without reference to any cache parameters, are called cache-oblivious. They offer the advantages of simplicity, robustness, and portability over cache-aware algorithms, whose resource usage must be managed explicitly.

The ideal-cache model has two complexity measures. The first is the work complexity, that is, the conventional running time of the algorithm in the RAM model. The second comes from the external memory model: the number of memory transfers. The ideal-cache model, however, modifies the external memory model in a few ways: it assumes an optimal block-replacement strategy, automatic replacement, and full associativity. All of these assumptions are justified by a collection of reductions by Frigo et al.
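
As a small illustration of the two measures (a standard example, not spelled out in the original answer), write $W$ for work and $Q$ for memory transfers, and let $B$ be the block (cache-line) size. A linear scan of $N$ contiguous elements then costs

$$W(N) = \Theta(N), \qquad Q(N) = \left\lceil N/B \right\rceil + 1 = \Theta(N/B + 1),$$

where the $+1$ accounts for the array possibly not being aligned to a block boundary.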

Algorithms and data structures are then analyzed in the model. For example, consider the classic binary search. The runtime leads to a recurrence

$$T(N) = T(N/2) + O(1).$$

Looking at the recursion tree, there are $\log N$ levels, each contributing $O(1)$, yielding the solution $\Theta(\log N)$. When analyzing algorithms in the RAM model, we usually assume the base case $T(1)=O(1)$ for our recurrences. In the ideal-cache model, however, we let the algorithm divide the problem all the way down, but in the analysis we stop at the point where the subproblem either fits in cache or occupies only $O(1)$ memory blocks (that is, has size $O(B)$). Here that means we can improve on our analysis by using the stronger base case $T(O(B))=O(1)$. With this, classic binary search incurs $\Theta(\log N - \log B)$ memory transfers. What is more, classic binary search is not optimal in the ideal-cache model; cleverer layouts are needed to achieve the optimal $\Theta(\log_B N)$ bound.
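
Spelling out that last analysis step (just unrolling the recurrence with the stronger base case): the recursion stops being charged once the remaining search range fits in a single block, so only the top part of the recursion tree counts,

$$T(N) = T(N/2) + O(1), \quad T(O(B)) = O(1) \;\Longrightarrow\; T(N) = \Theta(\log N - \log B) = \Theta\!\left(\log \tfrac{N}{B}\right).$$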

I guess the success of the model comes from the fact that it is easy and fun to work with, and it also works in practice, meaning it has been fairly effective at predicting the cost of algorithms on real systems. There are a lot of cache-oblivious algorithms and data structures. For example, we have cache-oblivious algorithms for many problems in computational geometry, for scanning dynamic sets, for bioinformatics, stencil computations, linear algebra, and suffix array construction, and for some fundamental graph problems using cache-oblivious priority queues. The main techniques that prove effective are linear scanning of memory and divide-and-conquer.
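
As an illustration of the divide-and-conquer technique, here is a minimal C sketch (my own illustration, not taken from any of the cited papers) of a cache-oblivious out-of-place matrix transpose. No cache parameter appears anywhere: always splitting the larger dimension in half shrinks the subproblems until they fit in cache on their own, which is what yields the $\Theta(1 + nm/B)$ transfer bound known for this problem. The small base-case threshold is only there to amortize recursion overhead; it is a fixed constant, not a cache parameter, so the algorithm remains cache-oblivious.

```c
/* Cache-oblivious out-of-place transpose: At[j][i] = A[i][j].
 * Minimal sketch: no cache parameters appear; the recursion on the larger
 * dimension shrinks subproblems until they fit in cache by themselves. */
#include <stdio.h>

#define THRESH 16  /* fixed base-case size, only to cut recursion overhead */

/* Transpose rows r0..r1-1 and columns c0..c1-1 of a row-major matrix A with
 * leading dimension lda into the matrix At with leading dimension ldb. */
static void transpose_rec(const double *A, double *At, size_t lda, size_t ldb,
                          size_t r0, size_t r1, size_t c0, size_t c1) {
    size_t rows = r1 - r0, cols = c1 - c0;
    if (rows <= THRESH && cols <= THRESH) {
        for (size_t i = r0; i < r1; i++)
            for (size_t j = c0; j < c1; j++)
                At[j * ldb + i] = A[i * lda + j];
    } else if (rows >= cols) {
        size_t rm = r0 + rows / 2;            /* split the taller dimension */
        transpose_rec(A, At, lda, ldb, r0, rm, c0, c1);
        transpose_rec(A, At, lda, ldb, rm, r1, c0, c1);
    } else {
        size_t cm = c0 + cols / 2;            /* split the wider dimension */
        transpose_rec(A, At, lda, ldb, r0, r1, c0, cm);
        transpose_rec(A, At, lda, ldb, r0, r1, cm, c1);
    }
}

int main(void) {
    enum { N = 3, M = 4 };                    /* small demo: A is N x M */
    double A[N][M], At[M][N];
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < M; j++)
            A[i][j] = 10.0 * i + j;

    transpose_rec(&A[0][0], &At[0][0], M, N, 0, N, 0, M);

    for (size_t i = 0; i < M; i++) {          /* print the M x N transpose */
        for (size_t j = 0; j < N; j++)
            printf("%5.1f ", At[i][j]);
        printf("\n");
    }
    return 0;
}
```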

In practice, the overall conclusion has been that cache-oblivious algorithms often outperform their RAM-model counterparts. However, cache-aware algorithms are usually faster still, since they are tuned for a specific problem size and memory hierarchy. For a nice example, see Ladner et al., A Comparison of Cache Aware and Cache Oblivious Static Search Trees Using Program Instrumentation, 2002.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange