Question

I understand that capacity is the number of elements or available spaces in an ArrayList that may or may not hold a value referencing an object. I am trying to understand more about the concept of capacity.

So I have three questions:

1) What are some good ways to define what capacity represents from a memory standpoint?

...the (contiguous?) memory allocated to the ArrayList?

...the ArrayLists’s memory footprint on the (heap?)?

2) Then if the above is true, changing capacity requires some manner of memory management overhead?

3) Anyone have an example where #2 was or could be a performance concern? Aside from maybe a large number of large ArrayLists having their capacities continually adjusted?

Was it helpful?

Solution

  1. The class is called ArrayList because it's based on an array. The capacity is the size of the array, which requires a block of contiguous heap memory. However, note that the array itself contains only references to the elements, which are separate objects on the heap.
  2. Increasing the capacity requires allocating a new, larger array and copying all the references from the old array to the new one, after which the old one becomes eligible for garbage collection.
  3. You've cited the main case where performance could be a concern. In practice, I've never seen it actually become a problem, since the element objects usually take up much more memory (and possibly CPU time) than the list.

OTHER TIPS

ArrayList is implemented like this:

class ArrayList {
  private Object[] elements;
}

the capacity is the size of that array.

Now, if your capacity is 10, and you're adding 11-th element, ArrayList will do this:

Object[] newElements = new Object[capacity * 1.5];
System.arraycopy(this.elements, newElements);
this.elements = newElements;

So if you start off with a small capacity, ArrayList will end up creating a bunch of arrays and copying stuff around for you as you keep adding elements, which isn't good.

On the other hand, if you specify a capacity of 1,000,000 and add only 3 elements to ArrayList, that also is kinda bad.

Rule of thumb: if you know the capacity, specify it. If you aren't sure but know the upper bound, specify that. If you just aren't sure, use the defaults.

Capacity is as you described it -- the contiguous memory allocated to an ArrayList for storage of values. ArrayList stores all values in an array, and automatically resizes the array for you. This incurs memory management overhead when resizing.

If I remember correctly, Java increases the size of an ArrayList's backing array from size N to size 2N + 2 when you try to add one more element than the capacity can take. I do not know what size it increases to when you use the insert method (or similar) to insert at a specific position beyond the end of the capacity, or even whether it allows this.

Here is an example to help you think about how it works. Picture each space between the |s as a cell in the backing array:

| | |

size = 0 (contains no elements), capacity = 2 (can contain 2 elements).

|1| |

size = 1 (contains 1 element), capacity = 2 (can contain 2 elements).

|1|2|

size = 2, capacity = 2. Adding another element:

|1|2|3| | | |

size increased by 1, capacity increased to 6 (2 * 2 + 2). This can be expensive with large arrays, as allocating a large contiguous memory region can require a bit of work (as opposed to a LinkedList, which allocates many small pieces of memory) because the JVM needs to search for an appropriate location, and may need to ask the OS for more memory. It is also expensive to copy a large number of values from one place to another, which would be done once such a region was found.

My rule of thumb is this: If you know the capacity you will require, use an ArrayList because there will only be one allocation and access is very fast. If you do not know your required capacity, use a LinkedList because adding a new value always takes the same amount of work, and there is no copying involved.

1) What are some good ways to define what capacity represents from a memory standpoint?

...the (contiguous?) memory allocated to the ArrayList?

Yes, an ArrayList is backed up by an array, to that represents the internal array size.

...the ArrayLists’s memory footprint on the (heap?)?

Yes, the larget the array capacity, the more footprint used by the arraylist.

2) Then if the above is true, changing capacity requires some manner of memory management overhead?

It is. When the list grows large enough, a larger array is allocated and the contents copied. The previous array maybe discarded and marked for garbage collection.

3) Anyone have an example where #2 was or could be a performance concern? Aside from maybe a large number of large ArrayLists having their capacities continually adjusted?

Yes, if you create the ArrayList with initial capacity of 1 ( for instance ) and your list grows way beyond that. If you know upfront the number of elements to store, you better request an initial capacity of that size.

However I think this should be low in your list of priorities, while array copy may happen very often, it is optimized since the early stages of Java, and should not be a concern. Better would be to choose a right algorithm, I think. Remember: Premature optimization is the root of all evil

See also: When to use LinkedList over ArrayList

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top