Question

I need ideas to implement a (really) high performance in-memory Database/Storage Mechanism in Java. In the range of storing 20,000+ java objects, updated every 5 or so seconds.
Some options I am open to:

Pure JDBC/database combination

JDO

JPA/ORM/database combination

An Object Database

Other Storage Mechanisms

What is my best option? What are your experiences?

EDIT: I also need like to be able to Query these objects

Was it helpful?

Solution

You could try something like Prevayler (basically an in-memory cache that handles serialization and backup for you so data persists and is transactionally safe). There are other similar projects. I've used it for a large project, it's safe and extremely fast.

If it's the same set of 20,000 objects, or at least not 20,000 new objects every 5 seconds but lots of changes, you might be better off cacheing the changes and periodically writing the changes in batch mode (jdbc batch updates are much faster than individual row updates). Depends on whether you need each write to be transactionally wrapped, and whether you'll need a record of the change logs or just aggregate changes.

Edit: as other posts have mentioned Prevayler I thought I'd leave a note on what it does: Basically you create a searchable/serializable object (typically a Map of some sort) which is wrapped in a Prevayler instance, which is serialized to disk. Rather than making changes directly to your map, you make changes by sending your Prevayler instance a serializable record of your change (just an object that contains the change instruction). Prevayler's version of a transaction is to write your serialization changes to disk so that in the event of failure it can load the last complete backup and then replay the changes against that. It's safe, although you do have to have enough memory to load all of your data, and it's a fairly old API, so no generic interfaces, unfortunately. But definitely stable and works as advertised.

OTHER TIPS

I highly recommend H2. This is a kind of "second generation" version of HSQLDB done by one of the original authors. H2 allows us to unit-test our DAO layer without requiring an actual PostgreSQL database, which is awesome.

There is an active net group and mailing list, and the author Thomas Mueller is very responsive to queries (hah, little pun there.)

I don't know if it is the fastest option, but I've been very satisfied with H2 whenever I've used it. It's written by the same person who originally wrote Hypersonic (which later became HSQLDB).

Another option that is allegedly very fast is Prevayler.

It is a bit of an old question, but these days there is a whole lot of databases that have a level of performance of 20,000/s. Which database to chose depends on data structure and type of queries you'd like to be making. It also depends on overall volume.

We had similar problem with large volume of time series data, about 300,000 rec/s and we ended up writing a new database NFSdb, with simple enough API and decent performance. It can do about 2,000,000 object writes/s and we did away without ORM. Storage API looks something like:

JournalFactory factory = new JournalFactory("/mnt1/data/tick");

MyObject o = new MyObject();
try (JournalWriter<MyObject> writer = factory.writer(MyObject.class)) {

   o.setBlah(...);
   writer.append(o);

   // more appends here
   //
   writer.commit();
}

Try the following, it performs really well with Hibernate and other ORM frameworks

http://hsqldb.org/

I would give a try to OrientDB.

Chronicle Map is an embeddable pure Java persistent database, providing a simple java.util.Map interface. It withstands about 1 million queries/updates per second from a single thread, consistent read/write performance and scales almost linearly to the number of cores in the machine.

Here are some recent performance research with actual numbers:

Terracotta might also be an answer for you. It allows multiple VMs to share objects so you can distribute load etc...

You can also check out db4o

If you want to store all of your data in memory, you might want to look at Prevayler.

I've never used it myself, but it seems like a much better solution than using a relational database for those cases in which all of your data can be stored in memory.

Berkeley DB for Java is a fast in memory database, extremely useful for simple object graphs.

hsqldb is quite fast, but it is not ACID transaction-safe. The fastest java-database I know is db4o: benchmarks.

Edit: Please notice that Prevayler is not a database, see http://www.prevayler.org/wiki.jsp?topic=PrevaylerIsNotADatabase. If you're out of RAM, you're out of luck.

H2 is truly fantastic, indeed, in memory, normal server and transactional, you have it all. However It doesn't compare in performance to the object databases, I see Db4o mentioned, I have had much better performance with Neodatis in fact, and everything nicely set up in Maven repositories. Although not very robust, like a Ferrari, fast but not a truck like Oracle.

You can try CSQL (available under open source and enterprise version) It provides 30X performance improvement over disk based database systems and provides JDBC interface. It can be configured to work as stand alone main memory database or as a transparent cache to MySQL, Postgres, Oracle databases.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top