The data structure of libev watchers

https://stackoverflow.com/questions/13015721

13-07-2021
|

Question

Libev uses three data structures to storage different watchers.

Heap: for watchers that sorted by time, such as ev_timer and ev_periodic.

Linked list: such as ev_io, ev_signal, ev_child and etc.

Array: such as ev_prepare, ev_check, ev_async and etc.

There is no doubt about that uses heap to store timer watcher. But what is the criteria of selecting linked list and array?

The data structure that stores ev_io watchers seems a little complex. It first is an array that with fd as its index and the element in the array is a linked list of ev_io watcher. It is more convenient to allocate space for array if use linked list as element. Is it the reason?

Or just because of the insert or remove operation of ev_io is more frequently and the ev_prepare seems more stable?

Or any other reasons?

Solution

The expectation is that there are only a few (usually one, and almost always at most two) io watchers for the same fd (similarly for signals). Putting the list links into the watcher means no extra allocations are required, as would be required if an array was used per watcher. If a lot of I/O watchers were active on the same fd, then this linked list approach would be slower.

Arrays are used because insertion and removal is very fast (the watcher stores the array index). Using a 4-byte index also reduces memory requirements on 64 bit machines (12 bytes per watcher as opposed to e.g. 16 for a doubly-linked list), and using an array of pointers means that all pointers are near each other in memory, which improves cache efficiency when scanning, which is a frequent operation for some watchers.

Cache efficiency is also the reason why a 4-heap is used over a 2-heap, and the reason why the time values are copied to the heap data structure, to avoid having to access the watcher memory on heap operations.

Child watchers actually use a fixed-size hash table, and again the expectation is that the number of child watchers per hash bucket is small, so the list pointers become part of the watcher data structure.

OTHER TIPS

Probably the reason is that in typical scenario ev_io needs to be looked up by fd. Underlying library (epoll, select, or whatever) would provide fd that has some event. Libev would then simply use it as index, and the just iterate over the linked list of event watchers that need to be invoked. So it can be fast in firing events.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow