Question

Sorry that the title isn't exactly obvious, but I couldn't word it better.

We are right now using a conventional DB (oracle) as our job queue, and these "jobs" are consumed by some number of nodes (machines). So the DB server gets hit by these nodes, and we have to pay a lot for the software and hardware for this database server.

Now, it occurred to me the other day that,

1) There are already multiple nodes in the system
2) "Jobs" may not be lost because of node failures, but there is no reason they have to be sitting in a secondary storage (no reason why they couldn't reside in memory, as long as they are not lost)

Given this, couldn't one retain these jobs in-memory, making sure that at least n number of copies of this job is present in the entire cluster, thereby getting rid of the DB server?

Are such technologies available?

Was it helpful?

Solution

Did you take a look at Gigaspaces? On an internet scale, you do not need to persist at all. You just have to know sufficient copies are around. If you have low latency connections to places that are not on the same powergrid (or have battery power), pushing out your transactions to the duplicates is enough.

OTHER TIPS

If you're only looking at storing up to a few terabytes of data, and you're looking for redundancy vs. disk recoverability, then take a look at Oracle Coherence. For example:

  • Elastic. Just add nodes. Auto-discovery. Auto-load-balancing. No data loss. No interruption. Every time you add a node, you get more data capacity and more throughput.
  • Use both RAM and flash. Transparently. Easily handle 10s or even 100s of gigabytes per Coherence node (e.g. up to a TB or more per physical server).
  • Automatic high availability (HA). Kill a process, no data loss. Kill a server, no data loss.
  • Datacenter continuous availability (CA). Kill a data center, no data loss.

For the sake of full disclosure, I work at Oracle. The opinions and views expressed in this post are my own, and do not necessarily reflect the opinions or views of my employer.

It depends on how much you expect these technologies to do for you. There are loads of basic in-memory databases (SQLite, Redis, etc) and you can use normal database replication techniques with multiple slaves in multiple data centers to pretty much ensure durability without persistence.

If you're storing in memory you're likely going to run out of space and require horizontal partitioning (sharding) and may want to check out something like VoltDB if you want to stick with SQL.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top