Java NIO - Memory mapped files

Question 1

There are several ways.

Let the writer acquire an exclusive Lock on the region that has not been written yet. Release the lock when everything has been written. This is compatible to every other application running on that system but it requires the reader to be smart enough to retry on failed reads unless you combine it with one of the other methods
Use another communication channel, e.g. a pipe or a socket or a file’s metadata channel to let the writer tell the reader about the finished write.

Write at a position in the file a special marker (being part of the protocol) telling about the written data, e.g.

MappedByteBuffer bb;
…
// write your data

bb.force();// ensure completion of all writes
bb.put(specialPosition, specialMarkerValue);
bb.force();// ensure visibility of the marker

Question 2

I do a lot of work with memory-mapped files for interprocess communication. I would not recommend Holger's #1 or #2, but his #3 is what I do. But a key point is perhaps that I only ever work with a single writer - things get more complicated if you have multiple writers.

The start of the file is a header section with whatever header variables you need, most importantly a pointer to the end of the written data. The writer should always update this header variable after writing a piece of data, and the reader should never read beyond this variable. A thing called "cache coherency" that all mainstream CPU's have will guarantee that the reader will see memory writes in the same sequence they are written, so the reader will never read uninitialised memory if you follow these rules. (An exception is where the reader and writers are on different servers - cache coherency doesn't work there. Don't try to implement shared memory across different servers!)

There is no limit to how frequently you can update the end-of-file pointer - it's all in memory and there won't be any i/o involved, so you can update it each record or each message you write.

ByteBuffer has versions of 'getInt()' and 'putInt()' methods which take an absolute byte offset, so that's what I use for reading & writing the end-of-file marker...I never use the relative versions when working with memory-mapped files.

There's no way you should use the file size or yet another interprocess method to communicate the end-of-file marker and no need or benefit when you already have shared memory.

Question 3

Check out my library Mappedbus (http://github.com/caplogic/mappedbus) which enables multiple Java processes (JVMs) to write records in order to the same memory mapped file.

Here's how Mappedbus solves the synchronization problem between multiple writers:

The first eight bytes of the file make up a field called the limit. This field specifies how much data has actually been written to the file. The readers will poll the limit field (using volatile) to see whether there's a new record to be read.
When a writer wants to add a record to the file it will use the fetch-and-add instruction to atomically update the limit field.
When the limit field has increased a reader will know there's new data to be read, but the writer which updated the limit field might not yet have written any data in the record. To avoid this problem each record contains an initial byte which make up the commit field.
When a writer has finished writing a record it will set the commit field (using volatile) and the reader will only start reading a record once it has seen that the commit field has been set.

(BTW, the solution has only been verified to work on Linux x86 with Oracle's JVM. It most likely won't work on all platforms).