Question

I am working for a school district, and we are planning on using Drools to implement the following types of rules for the student population of the district's constituent schools:

  • If a student has 3 absences during a year their attendance metric moves to a WARN status.
  • If a student has 6 absences during a year their attendance metric moves to a CRITICAL status.
  • If a student has 3 major behavior incidents during a year their behavior metric moves to a WARN status.
  • If a student has 2 minor and 2 major behavior incidents during a year their behavior metric moves to a CRITICAL status.
  • ...these are just examples from the top of my head, but there are many more rules of a similar nature.
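To pin down what the example rules above mean, here is a minimal plain-Java sketch of the thresholds. It is not Drools code (in practice these would be DRL rules), and the names `MetricStatus` and `StudentMetrics` are made up for illustration. It assumes that "has 3 absences" means "3 or more", and that CRITICAL wins when both behavior rules match; neither assumption is stated in the question.

```java
// Plain-Java sketch of the threshold semantics only; not Drools code.
// All names here are hypothetical.
enum MetricStatus { OK, WARN, CRITICAL }

final class StudentMetrics {
    private StudentMetrics() {}

    // Assumption: "has N absences" means "N or more" during the school year.
    static MetricStatus attendance(int absences) {
        if (absences >= 6) return MetricStatus.CRITICAL;
        if (absences >= 3) return MetricStatus.WARN;
        return MetricStatus.OK;
    }

    // Assumption: when both behavior rules match, CRITICAL takes precedence.
    static MetricStatus behavior(int minorIncidents, int majorIncidents) {
        if (minorIncidents >= 2 && majorIncidents >= 2) return MetricStatus.CRITICAL;
        if (majorIncidents >= 3) return MetricStatus.WARN;
        return MetricStatus.OK;
    }
}
```

Note that a student with 2 minor and 3 major incidents matches both behavior rules, so the precedence between WARN and CRITICAL has to be decided however the rules end up being written.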

All of these rules can be expressed simply using Drools Expert. Also, the processing of the rules for a student does not need to be synchronous. I have a couple of questions about the best way to implement this.

  1. From one standpoint this could be viewed as a monitoring system for a stream of events. This made me think of creating a stateful session into which each new event would be inserted. However, the events happen over the course of 9 months and are relatively infrequent. Also, we could build a session per school, or a session per student.

    • Would keeping a session in memory for that long be a problem?
    • If the server failed, would we need to rebuild the session state from scratch, or would it be advisable to take regular snapshots and just restore the facts that occurred since the time of the snapshot?
  2. Another option would be to persist a session for each student after an event is processed for that student. When the next event comes in we would retrieve their session from storage and insert the new fact. This way we wouldn't need to retrieve all the facts for each run of the engine to get the student's status. Would a configuration like this be supported? Are there any cons to doing this?

  3. A third approach would be to respond to a new fact for a student by retrieving all other facts the rules need to run, creating a new KnowledgeSession and running the rules.

Any advice on what might be the best approach would be greatly appreciated.

Dave


Solution

I would go with solution number 2: one session per student. Given that you are not going to interact with the session very often, I would keep it in a database and only restore it when needed. That is, when a new absence or incident arrives, the session for that student is restored from the database, the facts are inserted, the rules are executed and the resulting status is retrieved.
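The flow described above (restore, insert, execute, retrieve) can be sketched in plain Java as follows. This only illustrates the sequence of steps and is not Drools code: `StudentSessionState`, `InMemorySessionStore` and `AbsenceProcessor` are hypothetical stand-ins, a `HashMap` plays the role of the database, and the "rules" are reduced to the two absence thresholds from the question.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a persisted per-student session; not a Drools type.
final class StudentSessionState {
    int absences;
}

// Hypothetical store; the HashMap stands in for the database.
final class InMemorySessionStore {
    private final Map<String, StudentSessionState> rows = new HashMap<>();

    // Returns the stored state for the student, creating it on first use.
    StudentSessionState loadOrCreate(String studentId) {
        return rows.computeIfAbsent(studentId, id -> new StudentSessionState());
    }

    // With a real database this would write the state back; here it is a map put.
    void save(String studentId, StudentSessionState state) {
        rows.put(studentId, state);
    }
}

final class AbsenceProcessor {
    private final InMemorySessionStore store;

    AbsenceProcessor(InMemorySessionStore store) {
        this.store = store;
    }

    // Handles one absence event and returns the student's attendance status.
    String onAbsence(String studentId) {
        StudentSessionState state = store.loadOrCreate(studentId); // restore the session
        state.absences++;                                          // insert the new fact
        String status = evaluate(state);                           // execute the rules
        store.save(studentId, state);                              // persist the session
        return status;                                             // retrieve the status
    }

    // Assumption: "has N absences" means "N or more".
    private static String evaluate(StudentSessionState state) {
        if (state.absences >= 6) return "CRITICAL";
        if (state.absences >= 3) return "WARN";
        return "OK";
    }
}
```

With a persisted Drools session, the load and save steps would be handled by the persistence layer rather than written by hand.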

The main disadvantage I see with this scenario is that creating rules about more than one student is not straightforward, because you have to feed your facts to more than one session. For example, suppose you want to raise an alert if more than 10 students in a single class have CRITICAL status. In that case, a session per class would be enough. So, as you can see, you have to decide what is better for you. But no matter which 'unit' you choose (school, class, student), I would still recommend the execution flow I mentioned earlier.

Drools already comes with support for database persistence using JPA. You can find more information about this feature here: http://docs.jboss.org/drools/release/5.5.0.Final/drools-expert-docs/html_single/#d0e3961

The basic idea is that instead of creating your ksessions using kbase.newStatefulKnowledgeSession() you use the helper class called JPAKnowledgeService. This class will return a wrapper of a StatefulKnowledgeSession that will persist its state after each method invocation. In this class you will find 2 important methods: newStatefulKnowledgeSession(), to create a new ksession and loadStatefulKnowledgeSession() to retrieve an existing session from the database.

Hope it helps,

OTHER TIPS

There is a fourth option that makes maintenance simpler: build a single stateful knowledge session for the entire school district, covering all the students. After each event is processed successfully, persist the session so that you can reconstruct the working memory after a JVM failure. You will need more RAM and a larger heap allocation, but RAM is cheap these days. (We use 32 GB of RAM and set both -Xms and -Xmx to 16 GB.) Most likely your JVM will never go down, provided you have a 24x7 server.

Being lazy, I would go for the third approach. I would store all the events in a DB and then process all the students in a batch once per day, week or month (as you need). That allows you to create just one session with rules that cover multiple students, classes, etc. Unless you have 3+ million students you will be fine, and the app will perform well.
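As a rough illustration of the batch idea, the sketch below makes one pass over all stored absence events and produces a status per student. It is plain Java rather than a Drools session, `AbsenceEvent` and `BatchAttendanceRun` are hypothetical names, and the in-memory list stands in for rows read from the DB. It assumes "has N absences" means "N or more".

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical event row, as it might be read from the events table.
final class AbsenceEvent {
    final String studentId;

    AbsenceEvent(String studentId) {
        this.studentId = studentId;
    }
}

final class BatchAttendanceRun {
    private BatchAttendanceRun() {}

    // One pass over all stored events, returning a status per student id.
    static Map<String, String> run(List<AbsenceEvent> events) {
        // Count absences per student.
        Map<String, Integer> counts = new TreeMap<>();
        for (AbsenceEvent event : events) {
            counts.merge(event.studentId, 1, Integer::sum);
        }
        // Apply the thresholds from the question to each count.
        Map<String, String> statuses = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            int absences = entry.getValue();
            String status = absences >= 6 ? "CRITICAL" : absences >= 3 ? "WARN" : "OK";
            statuses.put(entry.getKey(), status);
        }
        return statuses;
    }
}
```

Because every student is visible in the same run, cross-student checks (for example, counting CRITICAL students per class) can be added to the same pass.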

Thanks for the suggestions and advice. I'm leaning towards #2 for a couple of reasons:

  • I think that gathering up all the facts for a given student to rerun them from scratch for every event that comes in will be a heavy-weight process that I'd rather avoid.
  • The use case I'm dealing with models as a very long-running monitoring process which leads me to believe (after reading the use cases for Fusion) that inserting events into a persistent KnowledgeSession is the way to go.
  • I don't think the scope of the problem space (i.e. a student vs a classroom, a school or the whole district) is a problem. If we need classroom metrics then we will just have a new rulebase for classes that also consumes the relevant events (tests, absences etc.)

The one caveat is that if the rules change, we need to re-evaluate all affected students against the new rulebase, which means gathering up all the facts and starting from the beginning of the school year. This shouldn't happen much at all, but if it becomes more frequent then I might move to the 3rd approach.

Thanks again for the assistance.

Licensed under: CC-BY-SA with attribution