Question

I vaguely recall reading somewhere that Scala's immutable indexed sequence operations are O(log n), but that the base of the logarithm is large enough so that for all practical purposes the operations are almost like O(1). Is that true?

How is IndexedSeq implemented to achieve this?

Solution

The default implementation of immutable.IndexedSeq is Vector. Here's an excerpt from the relevant documentation about its implementation:

Vectors are represented as trees with a high branching factor (The branching factor of a tree or a graph is the number of children at each node). Every tree node contains up to 32 elements of the vector or contains up to 32 other tree nodes. Vectors with up to 32 elements can be represented in a single node. Vectors with up to 32 * 32 = 1024 elements can be represented with a single indirection. Two hops from the root of the tree to the final element node are sufficient for vectors with up to 2^15 elements, three hops for vectors with 2^20, four hops for vectors with 2^25 elements and five hops for vectors with up to 2^30 elements. So for all vectors of reasonable size, an element selection involves up to 5 primitive array selections. This is what we meant when we wrote that element access is “effectively constant time”.
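To make the arithmetic in that excerpt concrete, here is a minimal sketch of how a 32-way tree maps an index to a path of array slots. It is not the actual Vector source; the names VectorDepthSketch, levelsFor, hopsFor and path are made up for illustration, and it assumes a plain radix tree with 5 bits of the index consumed per level.

```scala
object VectorDepthSketch {
  // Assumption for illustration: 5 bits per level, i.e. 32-way branching
  // as described in the quoted documentation.
  val Bits = 5
  val Width = 1 << Bits // 32

  /** Number of tree levels needed to hold `size` elements (1 = a single node). */
  def levelsFor(size: Int): Int = {
    require(size > 0, "size must be positive")
    var levels = 1
    var capacity = Width.toLong // Long, so the multiplication cannot overflow
    while (capacity < size) {
      capacity *= Width
      levels += 1
    }
    levels
  }

  /** Hops from the root node down to the node that holds the element. */
  def hopsFor(size: Int): Int = levelsFor(size) - 1

  /** Slot selected at each level, root first, for `index` in a tree with `levels` levels. */
  def path(index: Int, levels: Int): List[Int] =
    (levels - 1 to 0 by -1)
      .map(level => (index >>> (level * Bits)) & (Width - 1))
      .toList
}
```

Under these assumptions, hopsFor(1024) is 1 and hopsFor(1 << 30) is 5, matching the numbers in the excerpt, and path(1000, 2) is List(31, 8) because 1000 = 31 * 32 + 8.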

immutable.HashSet and immutable.HashMap are implemented using a similar technique.

OTHER TIPS

The default immutable IndexedSeq is a Vector, which is a tree (a trie, actually) with a fanout of 32. So, not counting memory locality, the depth behind the O(log n) cost never exceeds about 6; compare that with a binary tree, where it ranges from 1 to ~30.
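The depth comparison can be checked with a few lines of arithmetic. This is only a sketch: FanoutComparison and depth are hypothetical names, and depth simply computes the smallest d such that fanout^d covers the given size.

```scala
object FanoutComparison {
  /** Smallest d such that fanout^d >= size, i.e. ceil(log_fanout(size)) for size >= 1. */
  def depth(size: Long, fanout: Int): Int = {
    require(size >= 1 && fanout >= 2, "size must be >= 1 and fanout >= 2")
    var d = 0
    var capacity = 1L
    while (capacity < size) {
      capacity *= fanout
      d += 1
    }
    d
  }
}
```

For a 1G-element collection (2^30 elements), depth(1L << 30, 32) gives 6 while depth(1L << 30, 2) gives 30, which is where the "about 6" and "~30" figures come from.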

That said, if you also take memory locality into account, you will notice a huge difference between indexing into a 1G-element Vector and a 10-element Vector. (You'll notice a pretty big difference with an Array, too.)

Licensed under: CC-BY-SA with attribution