Question

We have a single relational database table which accumulates clients' orders. Orders have different types and parameters, as well as their own lifecycle (e.g. new, amend, update, cancel etc.).

Requirement:

  1. About 3 million orders come in per day.
  2. The peak rate is 400 new orders per second, which is a high insert rate.
  3. The database currently holds 100 million orders and keeps growing. Of course, they can be queried and aggregated.
  4. Hundreds of consuming applications around the world need to:
    • Get a set of orders in a date/time range by a filter from the database.
    • Subscribe by a filter and listen continuously to incoming order updates. (A filter is a set of conditions that orders must meet, because each application is interested in a particular subset of the data.)

We need a server-side application which meets all the above requirements.

Problems:

  1. We cannot achieve real-time requirements for No. 1 (<100 ms latency for each order). SQL queries are extremely slow because of frequent table inserts/updates, changing SELECT statements, query aggregations etc. in the RDBMS.
  2. We don't have a flexible filter for subscriptions at the moment either.
  3. The current solution is not scalable. The RDBMS is the main bottleneck.

I would be grateful to hear architectural and technological ideas. Thank you.

Solution

Distributed cache

Hazelcast is a distributed cache implementation and a free, open-source alternative to Oracle Coherence. It also supports write-behind and continuous query mechanisms, which in combination can solve your three problems.

  1. Use write-behind for asynchronous database writes through the cache.
  2. Use continuous queries to receive real-time updates for each cache put operation.
  3. Hazelcast is distributed and it's very easy to start a cluster. With write-behind, your scalability is limited by the number of Hazelcast nodes and the amount of available memory.
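
To make points 1 and 2 concrete, here is a minimal single-process sketch of the two mechanisms. It is not Hazelcast's API: the `WriteBehindCache` class, the order dictionaries and the `db_batches` list standing in for the database are all made up for illustration. A real cluster would flush on a timer and distribute entries across nodes.

```python
from collections import deque


class WriteBehindCache:
    """Toy in-process map (hypothetical, not Hazelcast's API)."""

    def __init__(self, store_batch, batch_size=100):
        self._data = {}
        self._pending = deque()   # entries not yet written to the database
        self._listeners = []      # (predicate, callback) pairs
        self._store_batch = store_batch
        self._batch_size = batch_size

    def add_listener(self, predicate, callback):
        # Continuous-query style: callback fires for every put that matches.
        self._listeners.append((predicate, callback))

    def put(self, key, order):
        self._data[key] = order              # readers see the value at once
        self._pending.append((key, order))   # the database write is deferred
        for predicate, callback in self._listeners:
            if predicate(order):
                callback(key, order)

    def get(self, key):
        return self._data.get(key)

    def flush(self):
        # Drain deferred writes in batches so the database sees fewer,
        # larger writes instead of one round trip per order.
        while self._pending:
            n = min(self._batch_size, len(self._pending))
            batch = [self._pending.popleft() for _ in range(n)]
            self._store_batch(batch)


db_batches = []   # stands in for the RDBMS
cancelled = []    # keys received by one subscriber

cache = WriteBehindCache(store_batch=db_batches.append, batch_size=2)
cache.add_listener(lambda o: o["status"] == "cancel",
                   lambda key, o: cancelled.append(key))

cache.put(1, {"id": 1, "status": "new"})
cache.put(2, {"id": 2, "status": "cancel"})
cache.put(3, {"id": 3, "status": "new"})
cache.flush()
```

The subscriber is notified at `put` time, before the order reaches the database, which is what keeps notification latency independent of database load. The trade-off is that unflushed entries live only in memory until the next flush.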

Messaging

Messaging middleware is another option that can be used to address your problems.

  1. Leverage asynchronous queues/topics to offload the database. The queues/topics would be logically partitioned in a flexible way so that you can tune the load on the database. This will require development of a separate layer between messaging and the database.
  2. Again, use topics to subscribe to incoming orders to create an effect of continuous querying (the layer that consumes orders and persists them to the database would be responsible for sending updates).
  3. Many messaging broker implementations also support clustering/distribution for additional parallelisation and scalability (e.g., HornetQ).

Note that with messaging you can have high availability, reliability and scalability at the same time.
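
The layering described in points 1 and 2 can be sketched roughly as follows. This is an in-process toy, not a broker client: `Topic`, `enqueue`, `drain`, `NUM_PARTITIONS` and the `client` field are assumptions made for the example, and the `database` dictionary stands in for the RDBMS.

```python
import zlib
from collections import deque

NUM_PARTITIONS = 4  # hypothetical; tune to what the database can absorb


class Topic:
    """Toy in-process topic with a filter per subscriber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, predicate, callback):
        self._subscribers.append((predicate, callback))

    def publish(self, message):
        for predicate, callback in self._subscribers:
            if predicate(message):
                callback(message)


queues = [deque() for _ in range(NUM_PARTITIONS)]  # inbound order queues
order_updates = Topic()
database = {}                                      # stands in for the RDBMS


def enqueue(order):
    # Route by client so that all events for one client stay in order.
    partition = zlib.crc32(order["client"].encode()) % NUM_PARTITIONS
    queues[partition].append(order)


def drain(partition):
    # Persistence layer: store each queued order, then announce it.
    queue = queues[partition]
    while queue:
        order = queue.popleft()
        database[order["id"]] = order   # a real layer would batch INSERTs
        order_updates.publish(order)    # subscribers only see stored orders


received = []
order_updates.subscribe(lambda o: o["client"] == "acme", received.append)

enqueue({"id": 1, "client": "acme", "status": "new"})
enqueue({"id": 2, "client": "globex", "status": "new"})
enqueue({"id": 3, "client": "acme", "status": "amend"})
for p in range(NUM_PARTITIONS):
    drain(p)
```

Each partition could be drained by its own worker, so the number of partitions becomes the knob for how much parallel write load reaches the database.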

Database level

Assuming you are using Oracle as your database, you have at least the following options for your problems 1-3:

  1. Partitioning of tables by region/date range/client name/other category - whatever fits your particular case - will give you some room for scaling your reads/writes. If you have a lot of CPU cores available to Oracle, proper partitioning may drastically increase query performance on big datasets by increasing the level of parallelisation (in one project we reduced the processing time of a huge dataset from 5 hours to 10 minutes).
  2. An INSERT trigger in combination with Oracle Advanced Queuing can be used to implement continuous querying (although, an external messaging broker may work better in some cases).
  3. See #1.
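
As a rough illustration of why partitioning helps date-range reads, the sketch below keeps one bucket of orders per calendar day and scans only the buckets that overlap the requested range. The database does this pruning internally for a range-partitioned table; the `insert` and `query` functions and the `created` field here are hypothetical and exist only to show the idea.

```python
from collections import defaultdict
from datetime import datetime

partitions = defaultdict(list)   # one list of orders per calendar day


def insert(order):
    partitions[order["created"].date()].append(order)


def query(start, end, predicate=lambda o: True):
    """Return orders in [start, end) that match, plus partitions scanned."""
    # Pruning step: skip every partition outside the requested date range.
    days = [d for d in partitions if start.date() <= d <= end.date()]
    rows = [o for d in sorted(days) for o in partitions[d]
            if start <= o["created"] < end and predicate(o)]
    return rows, len(days)


insert({"id": 1, "created": datetime(2024, 1, 1, 10, 0), "status": "new"})
insert({"id": 2, "created": datetime(2024, 1, 2, 10, 0), "status": "new"})
insert({"id": 3, "created": datetime(2024, 1, 2, 11, 0), "status": "cancel"})
insert({"id": 4, "created": datetime(2024, 1, 3, 10, 0), "status": "new"})

rows, scanned = query(datetime(2024, 1, 2, 9, 0),
                      datetime(2024, 1, 2, 17, 0),
                      lambda o: o["status"] == "new")
```

Here the query touches one of the three daily partitions, and independent partitions are also what lets the database spread a large scan over several CPU cores.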

In fact, you can combine the approaches described above to achieve the best performance and scalability for your specific case.

I have not considered migration to a NoSQL datastore because, IMO, NoSQL solutions are not the best fit for applications where data consistency is critical, which I assume is your case.

OTHER TIPS

Maybe the CQRS design pattern is suitable in this scenario; see Martin Fowler's article on it.

The idea is to separate select queries from insert/update commands.
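
A minimal sketch of that separation, assuming an event is a plain dictionary and both sides live in one process (the `OrderCommands` and `OrderQueries` names are invented for this example):

```python
from collections import Counter


class OrderCommands:
    """Write side: accepts inserts/updates and never serves queries."""

    def __init__(self):
        self.events = []      # stands in for the transactional store
        self._handlers = []

    def on_event(self, handler):
        self._handlers.append(handler)

    def submit(self, order_id, status):
        event = {"id": order_id, "status": status}
        self.events.append(event)
        for handler in self._handlers:
            handler(event)


class OrderQueries:
    """Read side: a denormalized view kept current from the events."""

    def __init__(self):
        self.current = {}     # latest status per order

    def apply(self, event):
        self.current[event["id"]] = event["status"]

    def count_by_status(self):
        return Counter(self.current.values())


commands = OrderCommands()
queries = OrderQueries()
commands.on_event(queries.apply)

commands.submit(1, "new")
commands.submit(2, "new")
commands.submit(1, "cancel")
summary = queries.count_by_status()
```

Aggregations run against the read model only, so heavy select traffic does not compete with the insert path. In a distributed setup the read model would be updated asynchronously and could briefly lag behind the write side.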

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow