Question

Is any standard way to keep netflow data (aggregated by ports and destination IP, distinguished by source IP)?

Data input: netflow data (source IP, timestamp, octets), updates are very often

Request input: IP, range (two timestamps)

Request output: number of octets

It it possible to get O(log(n)) or better for data storing and requests? How?

Was it helpful?

Solution

Have a (hash) map of IP to a binary search tree indexed by timestamp.

To efficiently count the number of elements between two nodes in the binary search tree, you could have each node store a count of the nodes in the subtree of its left child (similar to a rope, then you could determine the index in the BST for both the start and end timestamp, giving you the number of elements in between.

The map lookup is expected O(1), the BST queries are O(log n) each, giving O(log n) total.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top