Purely functional concurrent skip list

https://stackoverflow.com/questions/3489560

28-09-2019
|

Pergunta

Skip lists (Pugh, 1990) provide sorted dictionaries with logarithmic-time operations like search trees but skip lists are much more amenable to concurrent updates.

Is it possible to create an efficient purely functional concurrent skip list? If not, is it possible to create any kind of efficient purely functional concurrent sorted dictionary?

Solução

The property of skip lists that makes them good for concurrent updates (namely that most additions and subtractions are local) also makes them bad for immutability (namely that a lot of earlier items in the list point eventually to the later items, and would have to be changed).

Specifically, skip lists consist of structures that look like so:

NODE1 ---------------------> NODE2 ---------...
  |                           |
  V                           V
NODE1a --> NODE1b ---------> NODE2a --> NODE2b --> NODE2c --- ...

Now, if you have an update that, say, deletes NODE2b or NODE1b, you can take care of it very locally: you just point 2a to 2c or 1a to 2a respectively and you're done. Unfortunately, because the leaf nodes all point one to another, it's not a good structure for a functional (immutable) update.

Thus, tree structures are better for immutability (as the damage is always locally limited--just the node you care about and its direct parents up through the root of the tree).

Concurrent updates don't work well with immutable data structures. If you think about it, any functional solution has an update of A as f(A). If you want two updates, one given by f and one given by g, you pretty much have to do f(g(A)) or g(f(A)), or you have to intercept the requests and create a new operation h = f,g that you can apply all in one go (or you have to do various other highly clever stuff).

However, concurrent reads work fantastically well with immutable data structures since you are guaranteed to have no state change. If you don't assume that you can have a read/write loop that resolves before any other write can interrupt, then you never have to lock on read.

Thus, write-heavy data structures are probably better implemented mutably (and with something like a skip list where you only need to lock locally), while read-heavy data structures are probably better implemented immutably (where a tree is a more natural data structure).

Outras dicas

Andrew McKinlay's solution is the real "true" functional solution for a real skip-list here, but it has a downside. You pay logarithmic time to access an element, but now mutation beyond the head element becomes hopeless. The answer you want is buried down in countless path-copies!

Can we do better?

Part of the problem there is that there are multiple paths from -infinity to your item.

But if you think through the algorithms for searching a skiplist, you never use that fact.

We can think of each node in the tree as having a preferred link, the top-most link to it from the left, which in some sense can be thought of as 'owning' that entry.

Now we can consider the notion of a 'finger' into a data structure, which is a functional technique that enables you to focus on one particular element, and supply a path back to the root.

Now we can start with a simple skip-list

-inf-------------------> 16
-inf ------> 8 --------> 16
-inf -> 4 -> 8 -> 12 --> 16

Expanding it by level:

-inf-------------------> 16
  |                       |
  v                       v
-inf ------> 8 --------> 16
  |          |            |
  v          v            v
-inf -> 4 -> 8 -> 12 --> 16

Strip out all but the preferred pointers:

-inf-------------------> 16
  |                       |
  v                       v
-inf ------> 8           16
  |          |            |
  v          v            v
-inf -> 4    8 -> 12     16

Then you could move a 'finger' to position 8 by tracking all the pointers you'd have to flip to get there.

-inf ------------------> 16
   ^                      |
   |                      v
-inf <------ 8           16
   |         |            |
   v         v            v
-inf -> 4    8 -> 12     16

From there it is possible to delete 8, pushing the finger somewhere else, and you can continue to navigate through the structure with the finger.

Looked at this way, we can see that the privileged paths in a skip list form a spanning tree!

Moving 1 step with the finger is an O(1) operation if you only have privileged pointers in the tree and use "skinny nodes" like this. If you used fat nodes, then finger movement left/right would be potentially more expensive.

All operations remain O(log n) and you can use a randomized skip-list structure or a deterministic one as usual.

That said, when we break the skiplist down into a notion of a preferred path then we get that a skip-list is just a tree with some redundant non-preferred links we don't need for insert/search/delete, such that the length of each of those paths from the upper right is O(log n) either with high probability or guaranteed depending on your update strategy.

Even without the finger you can maintain O(log n) expected time per insert/delete/update in a tree with this form.

Now, the key word in your question that doesn't make sense is "concurrent". A purely functional data structure doesn't have a notion of in-place mutation. You always produce a new thing. In some sense concurrent functional updates are easy. Everybody gets their own answer! They just can't see each others'.

Not a skip list, but seems to match the problem description: Clojure's persistent red-black trees (see PersistentTreeMap.java). The source contains this notice:

/**
 * Persistent Red Black Tree
 * Note that instances of this class are constant values
 * i.e. add/remove etc return new values
 * <p/>
 * See Okasaki, Kahrs, Larsen et al
 */

These trees maintain the order of elements and are "persistent" in the sense in which Rich Hickey uses the word (immutable and able to maintain their performance guarantees as updated versions are constructed).

In case you want to play around with them, you can construct instances in Clojure code with the function sorted-map.

If you only need to cons on the front of the skip list, then it should be possible to make a persistent immutable version.

The advantage of this kind of skip list would be "random" access. e.g. You could access the n'th element faster than you could in a regular single linked list.

Licenciado em: CC-BY-SA com atribuição

Não afiliado a StackOverflow