Question

I have an event stream and a key-value store. The value size is limited to 4 KB. The event rate is not very heavy: at most a few hundred events a day.

In this value I need to store a serialized representation of a data structure that provides an efficient mechanism for reading, storing, and updating aggregated event counts over a period of 3 months, with daily and weekly aggregations and half-hour sliding windows.

The solution needs to perform the following tasks efficiently, both for simple event-count aggregations and for the standard deviation of the event count (the maximum period for all the tasks below is 3 months):

  1. constant updates (performed lazily, as each event arrives): if the latest calculated aggregations are too old, discard the outdated data and create new aggregations
  2. updates triggered by read requests (a user asks for some info, e.g. the event count for a specific user, the standard deviation of the event count for a single user, etc.): if the latest calculated aggregations are too old, discard them
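
For concreteness, here is a rough sketch of the kind of structure I could build from scratch (all class and method names are placeholders of my own, not tied to any framework): a ring buffer of 92 daily count buckets plus a short list of recent timestamps for the half-hour sliding window, with lazy eviction performed on every read and write. Size-wise, 92 buckets at 4 bytes each plus a handful of timestamps is well under the 4 KB limit.

```java
import java.nio.ByteBuffer;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;

public final class EventCountWindow {
    private static final int DAYS = 92;                 // ~3 months of daily buckets
    private static final long WINDOW_SECONDS = 30 * 60; // half-hour sliding window

    private final int[] dailyCounts = new int[DAYS];    // ring buffer indexed by epochDay % DAYS
    private final Deque<Long> recentEventSeconds = new ArrayDeque<>(); // events in the last 30 min
    private long lastSeenEpochDay;

    public EventCountWindow(Instant now) {
        lastSeenEpochDay = now.getEpochSecond() / 86_400;
    }

    /** Record one event; stale buckets are discarded lazily before counting it. */
    public void record(Instant now) {
        rollForward(now);
        dailyCounts[(int) (lastSeenEpochDay % DAYS)]++;
        recentEventSeconds.addLast(now.getEpochSecond());
    }

    /** Event count inside the half-hour sliding window ending at `now`. */
    public int slidingWindowCount(Instant now) {
        rollForward(now);
        return recentEventSeconds.size();
    }

    /** Total event count over the last `days` days (capped at 3 months). */
    public long totalCount(Instant now, int days) {
        rollForward(now);
        long total = 0;
        for (int i = 0; i < Math.min(days, DAYS); i++)
            total += dailyCounts[(int) ((lastSeenEpochDay - i) % DAYS)];
        return total;
    }

    /** Standard deviation of the daily counts over the last `days` days. */
    public double dailyCountStdDev(Instant now, int days) {
        rollForward(now);
        int n = Math.min(days, DAYS);
        double sum = 0, sumSq = 0;
        for (int i = 0; i < n; i++) {
            int c = dailyCounts[(int) ((lastSeenEpochDay - i) % DAYS)];
            sum += c;
            sumSq += (double) c * c;
        }
        double mean = sum / n;
        return Math.sqrt(Math.max(0, sumSq / n - mean * mean));
    }

    /** Lazy eviction: called by every update and every read. */
    private void rollForward(Instant now) {
        long nowSeconds = now.getEpochSecond();
        long today = nowSeconds / 86_400;
        // zero the daily buckets that are about to be reused (at most DAYS of them)
        for (long d = lastSeenEpochDay + 1; d <= today && d - lastSeenEpochDay <= DAYS; d++)
            dailyCounts[(int) (d % DAYS)] = 0;
        lastSeenEpochDay = today;
        // drop events that have slid out of the half-hour window
        while (!recentEventSeconds.isEmpty()
                && recentEventSeconds.peekFirst() < nowSeconds - WINDOW_SECONDS)
            recentEventSeconds.removeFirst();
    }

    /** Compact binary form: 92 ints plus a few longs, well under the 4 KB limit. */
    public byte[] serialize() {
        ByteBuffer buf = ByteBuffer.allocate(8 + 4 * DAYS + 4 + 8 * recentEventSeconds.size());
        buf.putLong(lastSeenEpochDay);
        for (int c : dailyCounts) buf.putInt(c);
        buf.putInt(recentEventSeconds.size());
        for (long t : recentEventSeconds) buf.putLong(t);
        return buf.array();
    }
}
```

Weekly aggregations would simply be sums of 7 consecutive daily buckets, and the standard deviation of the daily count is computed directly from the retained buckets, so no separate running-variance state is needed (deserialization is omitted for brevity).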

Is there any open source Java framework that can assist in implementing the above?

I would also appreciate design recommendations: design patterns, etc.

The solution is not difficult to implement from scratch using the standard Java API, but before doing that I would appreciate some open source framework suggestions (if any exist).

Googling for a solution did not lead me anywhere except to some theoretical articles, SQL-based solutions, and a non-open-source IBM toolkit called SPL.


Solution

Take a look at Esper.

Or StreamCruncher.
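
For example, with Esper, a per-user event count over a half-hour sliding window looks roughly like the sketch below. This assumes the pre-8.x Esper API (Esper 8 replaced it with a compile/deploy model) and the `#time(30 min)` window syntax (older releases spell it `win:time(30 min)`), so treat it as a rough illustration rather than a verified snippet. EPL also has built-in `avg` and `stddev` aggregation functions for the standard-deviation part.

```java
import com.espertech.esper.client.Configuration;
import com.espertech.esper.client.EPServiceProvider;
import com.espertech.esper.client.EPServiceProviderManager;
import com.espertech.esper.client.EPStatement;
import com.espertech.esper.client.EventBean;
import java.util.Iterator;

public class EsperSketch {

    // Hypothetical event type: one occurrence of the tracked event for a user.
    public static class UserEvent {
        private final String userId;
        public UserEvent(String userId) { this.userId = userId; }
        public String getUserId() { return userId; }
    }

    public static void main(String[] args) {
        Configuration config = new Configuration();
        config.addEventType("UserEvent", UserEvent.class);
        EPServiceProvider epService = EPServiceProviderManager.getDefaultProvider(config);

        // Per-user event count over a 30-minute sliding window.
        EPStatement stmt = epService.getEPAdministrator().createEPL(
                "select userId, count(*) as cnt "
                + "from UserEvent#time(30 min) group by userId");

        epService.getEPRuntime().sendEvent(new UserEvent("alice"));
        epService.getEPRuntime().sendEvent(new UserEvent("alice"));
        epService.getEPRuntime().sendEvent(new UserEvent("bob"));

        // Pull the current aggregation state instead of registering a listener.
        for (Iterator<EventBean> it = stmt.iterator(); it.hasNext(); ) {
            EventBean row = it.next();
            System.out.println(row.get("userId") + " -> " + row.get("cnt"));
        }
    }
}
```

Note that Esper keeps the window state in memory, so persisting whatever aggregate you need into the 4 KB value would still be your own code.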
