Question

Is there a way to detect and report duplicates in a stream?

Example: a stream of user login data containing tuples with a user name and an IP address. The goal is to detect logins by the same user from different IP addresses within the last 10 seconds.

I have taken a look at the Siddhi (WSO2) and Esper query languages, but I cannot figure out how to achieve this.

Solution

In WSO2 CEP (Siddhi), you can achieve this using 'Patterns'.

An example query:

from every a1 = authStream
            -> b1 = authStream[username == a1.username and ipAddress != a1.ipAddress]
            within 10000
insert into alertStream a1.username as detectedUserName

In the query above, a1 and b1 refer to two events from authStream. The pattern matches when an event a1 is followed by an event b1 with the same username but a different IP address. The 'within' keyword limits the time span in which the pattern must occur (given in milliseconds, so 10000 means 10 seconds). Each match is then inserted into alertStream.
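
To make the pattern's logic concrete, here is a minimal Python sketch of the same check outside any CEP engine. It is not how Siddhi works internally. The function name detect_multi_ip_logins, the (timestamp_ms, username, ip_address) tuple layout, the requirement that events arrive in timestamp order, and the inclusive 10-second boundary are all assumptions made for illustration. Like 'every a1 -> b1', each earlier login is matched at most once, by the first later login of the same user from a different IP.

```python
from collections import defaultdict, deque

WINDOW_MS = 10_000  # mirrors `within 10000` in the query above


def detect_multi_ip_logins(events, window_ms=WINDOW_MS):
    """Return alerts as (username, earlier_ip, later_ip, timestamp_ms).

    `events` is an iterable of (timestamp_ms, username, ip_address) tuples,
    assumed to be sorted by timestamp (hypothetical input format).
    """
    pending = defaultdict(deque)  # username -> deque of (timestamp_ms, ip)
    alerts = []
    for ts, user, ip in events:
        logins = pending[user]
        # Expire logins older than the window; the deque is in time order.
        while logins and ts - logins[0][0] > window_ms:
            logins.popleft()
        still_pending = deque()
        for earlier_ts, earlier_ip in logins:
            if earlier_ip != ip:
                # Same user, different IP, inside the window: report it.
                alerts.append((user, earlier_ip, ip, ts))
            else:
                # Same IP: keep waiting for a login from another address.
                still_pending.append((earlier_ts, earlier_ip))
        # The current login can itself start a new match.
        still_pending.append((ts, ip))
        pending[user] = still_pending
    return alerts


if __name__ == "__main__":
    sample = [
        (0, "alice", "10.0.0.1"),
        (3_000, "alice", "10.0.0.2"),   # different IP after 3 s
        (20_000, "alice", "10.0.0.1"),  # earlier logins have expired
    ]
    for alert in detect_multi_ip_logins(sample):
        print(alert)
```

Keeping one small queue per user is roughly what partitioning by username achieves in a CEP engine: memory stays bounded by the number of logins seen in the last 10 seconds.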

For more information on patterns, have a look at the WSO2 CEP documentation on patterns. The page on advanced queries may also help.

Other tips

In Esper, for example using match_recognize, which is based on the proposed SQL standard for row pattern matching:

select * from AuthStream.win:time(10 sec)
match_recognize (
  partition by username
  measures A.username as a_name
  pattern (A B)
  define 
    B as B.ipaddress != A.ipaddress 
)
Licensed under: CC-BY-SA with attribution