Question

My goal is to ensure that an array allocated in java is allocated across contiguous physical memory. The issue that I've run into is that the pages an array is allocated across tend not to be contiguous in physical memory, unless I allocate a really large array.

My questions are:

  • Why does a really large array ensure pages which are contiguous in physical memory?
  • Is there any way to ensure an array is allocated across contiguous physical memory, that doesn't involve making the array really large?
  • How can I tell what page or physical address a Java object/array exists in, without measuring cache hits/cache misses?

I'm not looking for answers asking why I am doing this in java. I understand that C would "solve my problem", and that I'm going against the fundamental nature of java. Nevertheless I have a good reason for doing this.

The answers need not be guaranteed to work all the time. I am looking for answers that work most of the time. Extra points for creative, out-of-the-box answers that no reasonable Java programmer would ever write. It's OK to be platform-specific (x86, 32-bit or 64-bit).

Solution

Given that the garbage collector moves objects around in (logical) memory, I think you are going to be out of luck.

About the best you could do is use ByteBuffer.allocateDirect. That will (typically) not get moved around in (logical) memory by the GC, but it may be moved in physical memory or even paged out to disc. If you want any better guarantees, you'll have to hit the OS.
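A minimal sketch of that approach; the buffer size and access pattern here are just illustrative:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class DirectBufferDemo {
    public static void main(String[] args) {
        // Allocate 1 MiB outside the Java heap. The GC will not relocate
        // this storage, though the OS may still page it out or remap it
        // physically.
        ByteBuffer buf = ByteBuffer.allocateDirect(1 << 20)
                                   .order(ByteOrder.nativeOrder());

        // Use absolute puts/gets to treat it as an array of longs.
        for (int i = 0; i < 16; i++) {
            buf.putLong(i * Long.BYTES, (long) i * i);
        }
        System.out.println(buf.getLong(3 * Long.BYTES)); // prints 9
        System.out.println(buf.isDirect());              // prints true
    }
}
```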

Having said that, if you can set the page size to be as big as your heap, then all arrays will necessarily be physically contiguous (or swapped out).

OTHER TIPS

No. Physically contiguous memory requires direct interaction with the OS. Most applications, the JVM included, only get virtually contiguous addresses. And a JVM cannot give you what it doesn't get from the OS.

Besides, why would you want it? If you're setting up DMA transfers, you probably are using techniques besides Java anyway.

Bit of background:

Physical memory in a modern PC is typically a flexible amount, on replaceable DIMM modules. Each byte of it has a physical address, so during boot the Operating System determines which physical addresses are available. It turns out applications are better off not using these addresses directly. Instead, all modern CPUs (and their caches) use virtual addresses. There is a mapping table to physical addresses, but this need not be complete - swap to disk is enabled by the use of virtual addresses not mapped to physical addresses.

Another level of flexibility is gained from having one table per process, with incomplete mappings. If process A has a virtual address that maps to physical address X, but process B doesn't, then there is no way process B can write to physical address X, and we can consider that memory to be exclusive to process A. Obviously, for this to be safe the OS has to protect access to the mapping tables, but all modern OSes do.

The mapping table works at the page level. A page (a contiguous range of virtual addresses) is mapped to a contiguous range of physical addresses. The tradeoff between overhead and granularity has resulted in 4KB pages being a common page size. But as each page has its own mapping, one cannot assume contiguity beyond that page size. In particular, when pages are evicted from physical memory, swapped to disk, and later restored, it's quite possible that they end up at a new physical memory address. The program doesn't notice, as the virtual address does not change; only the OS-managed mapping table does.
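To make the page granularity concrete, here is a small sketch that splits an address into a page number and an in-page offset, assuming the common 4 KiB page size (the address value is arbitrary):

```java
public class PageMath {
    static final int PAGE_SIZE = 4096; // 4 KiB, the common x86 page size
    static final int PAGE_SHIFT = 12;  // log2(4096)

    // Which page the address falls in; remapping changes where this
    // page lives physically.
    static long pageNumber(long addr) {
        return addr >>> PAGE_SHIFT;
    }

    // Offset within the page; bytes sharing a page number are
    // guaranteed physically contiguous, bytes on different pages are not.
    static long pageOffset(long addr) {
        return addr & (PAGE_SIZE - 1);
    }

    public static void main(String[] args) {
        long addr = 0x7f3a_1234L; // arbitrary example address
        System.out.println(pageNumber(addr));
        System.out.println(pageOffset(addr));
    }
}
```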

I would think that you would want to use sun.misc.Unsafe.
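If you go down that road, off-heap allocation via sun.misc.Unsafe looks roughly like this. Note the reflective grab of the theUnsafe field is HotSpot-specific and unsupported, and the memory you get is still only virtually contiguous:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class UnsafeAlloc {
    public static void main(String[] args) throws Exception {
        // Unsafe's constructor is private; pull out the singleton via
        // reflection. Unsupported API: may warn or break on newer JDKs.
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        long bytes = 64 * Long.BYTES;
        long base = unsafe.allocateMemory(bytes); // raw off-heap memory
        try {
            // Manual addressing: no bounds checks, no GC relocation.
            unsafe.putLong(base + 5 * Long.BYTES, 42L);
            System.out.println(unsafe.getLong(base + 5 * Long.BYTES));
        } finally {
            unsafe.freeMemory(base); // no GC here: free it yourself
        }
    }
}
```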

There may be ways to trick a specific JVM into doing what you want, but these would probably be fragile, complicated and most likely very specific to the JVM, its version, OS it runs on etc. In other words, wasted effort.

So without knowing more about your problem, I don't think anyone will be able to help. There certainly is no way to do it in Java in general, at most on a specific JVM.

To suggest an alternative:

If you really need to store data in contiguous memory, why not do it in a small C library and call that via JNI?
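The Java side of such a bridge might look like the sketch below. The class, method, and library names here are hypothetical; the matching C implementation (not shown) would do the actual allocation, e.g. with malloc or an OS call such as mmap:

```java
// Hypothetical Java half of a JNI bridge to natively-allocated storage.
// Requires a companion C library (libcontigstore.so / contigstore.dll)
// implementing these native methods; without it, loadLibrary will fail.
public class ContiguousStore {
    static {
        System.loadLibrary("contigstore");
    }

    // Allocate `count` doubles in native memory, returning an opaque handle.
    static native long alloc(int count);

    static native void put(long handle, int index, double value);

    static native double get(long handle, int index);

    // Native memory is not garbage collected: callers must free explicitly.
    static native void free(long handle);
}
```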

As I see it, you have yet to explain why:

  • primitive arrays are not contiguous in memory. I don't see why they wouldn't be contiguous in virtual memory. (Arrays of Objects, by contrast, are unlikely to have their elements contiguous in memory.)
  • an array which is not contiguous in physical memory (RAM, i.e. Random Access Memory) would make a significant performance difference, e.g. a measurable difference in the performance of your application.

What it appears is that you are really looking for a low-level way to allocate arrays because you are used to doing this in C, with performance as the claimed justification.

BTW: Accessing ByteBuffer.allocateDirect() with say getDouble()/putDouble() can be slower than just using a double[], as the former involves JNI calls and the latter can be optimised to no call at all.

The reason it is used is for exchanging data between the Java and C spaces. e.g. NIO calls. It only performs well when read/writes are kept to a minimum. Otherwise you are better off using something in the Java space.
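To illustrate the two access styles being compared (this is not a rigorous benchmark, just a side-by-side of the code shapes):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class AccessStyles {
    public static void main(String[] args) {
        int n = 1024;

        // Plain heap array: indexed access the JIT optimises well.
        double[] arr = new double[n];
        for (int i = 0; i < n; i++) arr[i] = i;

        // Direct buffer: same data off-heap, addressed by byte offset.
        ByteBuffer buf = ByteBuffer.allocateDirect(n * Double.BYTES)
                                   .order(ByteOrder.nativeOrder());
        for (int i = 0; i < n; i++) buf.putDouble(i * Double.BYTES, i);

        double sumArr = 0, sumBuf = 0;
        for (int i = 0; i < n; i++) {
            sumArr += arr[i];
            sumBuf += buf.getDouble(i * Double.BYTES);
        }
        System.out.println(sumArr == sumBuf); // prints true
    }
}
```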

i.e. Unless you are clear about what you are doing and why you are doing it, you can end up with a solution which might make you feel better, but is actually more complicated and performs worse than the simple solution.

Note this answer to a related question, which discusses System.identityHashCode() and identification of the memory address of the object. The bottom line is that you can use the default array hashCode() implementation to identify the original memory address of the array (subject to it fitting in an int/32 bits).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow