Question

I have read a lot of conflicting articles about memory allocation when a String is created. Some articles say that the new operator creates a String in the heap and a String literal is created in the String Pool [Heap], while others say that the new operator creates one object in the heap and another object in the String pool.

To analyse this, I wrote the program below, which prints the hash code of the String's internal char array and the identity hash code of the String object:

import java.lang.reflect.Field;

public class StringAnalysis {

    private int showInternalCharArrayHashCode(String s)
            throws SecurityException, NoSuchFieldException,
            IllegalArgumentException, IllegalAccessException {
        final Field value = String.class.getDeclaredField("value");
        value.setAccessible(true);
        return value.get(s).hashCode();
    }

    public void printStringAnalysis(String s) throws SecurityException,
            IllegalArgumentException, NoSuchFieldException,
            IllegalAccessException {
        System.out.println(showInternalCharArrayHashCode(s));

        System.out.println(System.identityHashCode(s));

    }

    public static void main(String[] args) throws SecurityException,
            IllegalArgumentException, NoSuchFieldException,
            IllegalAccessException {
        StringAnalysis sa = new StringAnalysis();
        String s1 = new String("myTestString");
        String s2 = new String("myTestString");
        String s3 = s1.intern();
        String s4 = "myTestString";

        System.out.println("Analyse s1");
        sa.printStringAnalysis(s1);

        System.out.println("Analyse s2");
        sa.printStringAnalysis(s2);

        System.out.println("Analyse s3");
        sa.printStringAnalysis(s3);

        System.out.println("Analyse s4");
        sa.printStringAnalysis(s4);

    }

}

This program prints following output:

Analyse s1
1569228633
778966024
Analyse s2
1569228633
1021653256
Analyse s3
1569228633
1794515827
Analyse s4
1569228633
1794515827

From this output one thing is very clear: irrespective of how a String is created, if Strings have the same value then they share the same char array.

Now my question is: where is this char array stored? Is it stored in the heap, or does it go to permgen? I would also like to understand how to differentiate between heap memory addresses and permgen memory addresses.

I have a big issue if it is stored in permgen, as it will eat up my precious, limited permgen space. And if the char array is stored not in permgen but in the heap, does that imply that String literals also use heap space [which is something I have never read]?


Solution 2

"From this output one thing is very clear that irrespective of how String is created, if Strings have same value then they share same char array"

Not quite: this is happening because you start with one literal string and create multiple instances from it. In the OpenJDK (Sun/Oracle) implementation, the String(String) constructor reuses the source's backing array when that array represents the entire string, and copies it only when the source string occupies part of a larger array. You can see this in src.jar, or here: http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/lang/String.java#String.%3Cinit%3E%28java.lang.String%29

If you carefully construct your source strings such that they start from different character arrays, you'll find that they don't share the backing array.
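A minimal sketch of that experiment, assuming an OpenJDK-style String with a private field named value (an implementation detail, not part of the API). The class and method names are made up for illustration. On Java 16 and later the reflective read is refused unless the JVM is started with --add-opens java.base/java.lang=ALL-UNNAMED, so the helper returns null in that case instead of failing.

```java
import java.lang.reflect.Field;

class BackingArrayCheck {

    // Returns the object held in String's private "value" field, or null when
    // the field is missing or reflective access is refused.
    // Assumption: the field is called "value", as in OpenJDK.
    static Object backingArray(String s) {
        try {
            Field value = String.class.getDeclaredField("value");
            value.setAccessible(true);
            return value.get(s);
        } catch (ReflectiveOperationException | RuntimeException e) {
            return null;
        }
    }

    // TRUE or FALSE when both fields could be read, null when they could not.
    static Boolean sharesBackingArray(String a, String b) {
        Object va = backingArray(a);
        Object vb = backingArray(b);
        if (va == null || vb == null) {
            return null;
        }
        return va == vb;
    }

    public static void main(String[] args) {
        char[] c1 = {'A', 'B', 'C', 'D', 'E'};
        char[] c2 = {'A', 'B', 'C', 'D', 'E'};
        String fromC1 = new String(c1);       // String(char[]) copies its argument
        String fromC2 = new String(c2);       // a second, independent copy
        String copyOfC1 = new String(fromC1); // String(String), as in the question

        System.out.println("equal contents: " + fromC1.equals(fromC2));
        System.out.println("separately built strings share storage: "
                + sharesBackingArray(fromC1, fromC2));
        System.out.println("new String(String) shares storage: "
                + sharesBackingArray(fromC1, copyOfC1));
    }
}
```

The second line should report false whenever the field is readable, because String(char[]) always copies its argument. The third line depends on the implementation being tested, which is the point made above.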

"Now my question is where is this chararray stored"

To the best of my knowledge, the character array for a string literal is stored on the heap (those with better knowledge of classloading internals, feel free to comment). Strings loaded from files will always store their backing arrays on the heap.

What I do know for sure is that the data structure used by intern() only references the String object, not its character array.

OTHER TIPS

From the String source:

public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}

It is clear that a string created with this constructor shares its char array (value) with the original string.

It's important to note that the API does not guarantee this sharing:

"Initializes a newly created String object so that it represents the same sequence of characters as the argument; in other words, the newly created string is a copy of the argument string. Unless an explicit copy of original is needed, use of this constructor is unnecessary since Strings are immutable."

For example, String.substring used to share the char array with the original string, but since Java 7 update 6 String.substring makes a copy of the relevant part of the char array.

Last first: By definition, the literal "myTestString" is interned, and all interned String references with the same value refer to the same physical String object. So the literal will be the EXACT SAME STRING as the result from intern.
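A small sketch of that identity claim, reusing the four variables from the question. It needs no reflection, only reference comparison with ==. The class name and the compare() helper are made up for illustration.

```java
class InternIdentityDemo {

    // Same setup as the question: two explicit copies, one intern() result,
    // and one literal.
    static boolean[] compare() {
        String s1 = new String("myTestString");
        String s2 = new String("myTestString");
        String s3 = s1.intern();
        String s4 = "myTestString";
        return new boolean[] {
            s1 == s2,      // distinct objects: each new String(...) allocates
            s3 == s4,      // intern() hands back the pooled literal itself
            s1 == s4,      // s1 is a copy, not the pooled object
            s1.equals(s4)  // the character sequences are still equal
        };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println("s1 == s2      : " + r[0]); // false
        System.out.println("s3 == s4      : " + r[1]); // true
        System.out.println("s1 == s4      : " + r[2]); // false
        System.out.println("s1.equals(s4) : " + r[3]); // true
    }
}
```

This matches the question's output, where s3 and s4 print the same identityHashCode while s1 and s2 each print their own.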

[Corrected] By definition, the hashCode (but not the identityHashCode) of two Strings with identical character sequence values will be identical.

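The hashCode of a char[] array, on the other hand, is identity-based (the same value that System.identityHashCode returns, typically derived from the object's address or a per-object random number) and bears no relation to the contents of the array. Since the question's program prints the same value for the array every time, this indicates that the value array is, in all of the above cases, the exact same array.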
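The difference between the two kinds of hash code can be sketched as follows. This is an illustration only, not the test case used in this answer, and the class and helper names are made up.

```java
import java.util.Arrays;

class HashCodeKinds {

    // Arrays do not override Object.hashCode(), so the two calls agree.
    static boolean arrayHashIsIdentityHash(char[] a) {
        return a.hashCode() == System.identityHashCode(a);
    }

    public static void main(String[] args) {
        char[] a1 = {'A', 'B', 'C', 'D', 'E'};
        char[] a2 = {'A', 'B', 'C', 'D', 'E'};
        String x = new String(a1);
        String y = new String(a2);

        // String.hashCode() is computed from the characters.
        System.out.println(x.hashCode() == y.hashCode());   // true
        System.out.println(x.hashCode());                   // 62061635

        // An array's hashCode() is its identity hash code.
        System.out.println(arrayHashIsIdentityHash(a1));    // true
        System.out.println(a1 == a2);                       // false

        // Content-based comparison of arrays goes through java.util.Arrays.
        System.out.println(Arrays.equals(a1, a2));          // true
        System.out.println(Arrays.hashCode(a1) == Arrays.hashCode(a2)); // true
    }
}
```

Two distinct arrays will normally print different identity hash codes, but that is not guaranteed, so the sketch does not rely on it.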

(Further info: The old implementation of String included a pointer to a char[], an offset, a length, and a hashCode value. Newer implementations drop the offset field, with the String value beginning at element 0 of the array. Other (non-Sun/non-Oracle) implementations do away with the separate char[] array and include the String's characters inside the main heap allocation. There is no requirement that the value field actually exist.)

[Continued] Copied over the test case and added a few lines. hashCode and identityHashCode produce the same values on a given char[], but produce different values on different arrays with the same contents.

The fact that the arrays are identical in s1 and s2 is almost certainly because they are sharing the char[] array of the interned literal "myTestString". If the Strings were separately constructed from "fresh" char[] arrays they would be different.

The main take-away from all this is that String literals are interned, and the implementation being tested "borrows" the array of the source when a String is copied with new String(String).

Char array hash codes
a1.hashCode() = 675303090
a2.hashCode() = 367959235
a1 identityHashCode = 675303090
a2 identityHashCode = 367959235
Strings from char arrays
a1 String = ABCDE
a1 String's hash = 62061635
a1 String value's identityHashCode = 510044439
a2 String = ABCDE
a2 String's hash = 62061635
a2 String value's identityHashCode = 1709651096
Licensed under: CC-BY-SA with attribution