Question

Wrapper classes are fine and their purpose is well understood. But why do we omit a primitive type for strings?


Solution

It depends on what you mean by "primitive".

"Primitive" in Java is usually taken to mean "value type". However, C# has a string keyword, which acts exactly the same as Java's String, it's just highlighted differently by the editor. They are aliases for the classes System.String or java.lang.String. String is not a value type in either language, so in this way it's not a primitive.

If by "primitive" you mean built into the language, then String is a primitive. It just uses a capital letter. String literals (those things in quotes) are automatically converted to System.String and + is used for concatenation. So by this token, they (and Arrays) are as primitive as ints, longs, etc.

First, what is a String?

String is not a wrapper. String is a reference type, while primitive types are value types. This means that if you have:

int x = 5;
int y = x;

The memory of x and y both contain "5". But with:

String x = "a";
String y = x;

The memory of x and y both contain a pointer to the character "a" (along with a length, an offset, a ClassInfo pointer, and a monitor). Strings behave like primitives because they're immutable, so this sharing is usually not an issue; however, if you, say, used reflection to change the contents of the string (don't do this!), both x and y would see the change. In fact, if you have:

char[] x = "a".toCharArray();
char[] y = x;
x[0] = 'b';
System.out.println(y[0] == 'b'); // prints "true"

So don't just use char[] (unless this is the behavior you want, or you're really trying to reduce memory usage).

Every Object is a reference type -- that means all classes you write, every class in the framework, and even arrays. The only things that are value types are the simple primitive types (byte, short, int, long, float, double, char, and boolean).
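To see the same aliasing with an ordinary class (Point here is a made-up example), compare:

class Point { int x; }

Point p = new Point();
Point q = p;               // copies the reference, not the object
q.x = 7;
System.out.println(p.x);   // prints 7: p and q name the same object

int a = 1;
int b = a;                 // copies the value itself
b = 7;
System.out.println(a);     // still prints 1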

Why isn't String mutable like char[]?

There are a couple of reasons for this, but it mostly comes down to psychology and implementation details:

  • Imagine the chaos you'd have if you passed a string into another function and that function changed it somehow. Or what if it saved it somewhere and changed it later? With most reference types you accept this as part of the type, but the Java developers decided that, at least for strings, they didn't want users to have to worry about it (see the sketch after this list).
  • A mutable string couldn't be read or updated atomically, so multithreading/synchronization would become an issue.
  • String literals (the things you put in your code in quotes) might be immutable at the machine level¹ (for security reasons). You could work around this by copying them all into another part of memory when the program starts, or by using copy-on-write, but that's slow.
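A minimal sketch of the first point, using a made-up Greeter class. Storing a char[] lets the caller reach inside later, while a stored String could never be changed out from under it:

class Greeter {
    private final char[] name;
    Greeter(char[] name) { this.name = name; }  // keeps the caller's array
    void greet() { System.out.println("Hello, " + new String(name)); }
}

char[] name = {'B', 'o', 'b'};
Greeter g = new Greeter(name);
g.greet();       // Hello, Bob
name[0] = 'R';   // the caller mutates the array it handed over...
g.greet();       // Hello, Rob -- the Greeter sees the change
// Had Greeter stored a String instead, no later mutation could reach it.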

Why don't we have a value-type version of a string?

Basically, performance and implementation details, plus the complexity of having two different string types. Other value types have a fixed memory footprint: an int is always 32 bits, a long is always 64 bits, a boolean is always 1 bit, etc.² Among other things, this means they can be stored on the stack, so that all the parameters to a function live in one place. Also, making gigantic copies of strings all over the place would kill performance.
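A small sketch of why passing a String is cheap: only the reference is copied, never the character data. The helper show below is made up; System.identityHashCode just makes the sharing visible:

static void show(String s) {
    // s holds the very reference the caller passed in; no characters are copied
    System.out.println(System.identityHashCode(s));
}

String big = "imagine megabytes of text here";
System.out.println(System.identityHashCode(big));  // prints some number...
show(big);                                          // ...and the same number again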

See also: In C#, why is String a reference type that behaves like a value type? That question refers to .NET, but the reasoning is just as applicable to Java.

¹ - In C/C++ and other natively compiled languages, this is true because string literals are placed in the code segment of the process, which the OS usually prevents you from editing. In Java it is usually not true, since the JVM loads class files onto the heap, so you could edit a string there. However, there's no reason a Java program couldn't be compiled natively (there are tools that do this), and some architectures (notably some versions of ARM) directly execute Java bytecode.

² - In practice, some of these types have a different size at the machine level. E.g., booleans are stored WORD-sized on the stack (32 bits on x86, 64 bits on x64), and in classes/arrays they may be treated differently again. This is all an implementation detail left up to the JVM -- the spec says booleans are either true or false, and the machine can figure out how to do it.

OTHER TIPS

The primitive type for String is char[].

This is true for many languages (C, Java, C#, C++ and many more...).
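In Java the correspondence is easy to see; you can convert in both directions:

char[] data = {'h', 'e', 'l', 'l', 'o'};
String s = new String(data);      // build a String from its character data
char[] back = s.toCharArray();    // get a (fresh) char array back out
System.out.println(s);            // hello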

Strings can be of arbitrary length. The designers of Java did not want a primitive type to which they could not assign a concrete memory size; this is one of the chief reasons String is not a primitive in Java.

String is sort of a special case. All the real primitive types (int, long, etc.) are passed by value and implemented directly in the JVM. String is a reference type, and so is dealt with like any other class (capital letter, a reference rather than the value itself is copied around...), except that the compiler has special hooks to treat it like a built-in type (+ for string concatenation, for example).
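For instance, the two assignments below are roughly equivalent: older versions of javac expanded + into StringBuilder calls (newer ones emit an invokedynamic instruction instead, but the observable result is the same):

String a = "count: " + 42;                                                // compiler-level sugar
String b = new StringBuilder().append("count: ").append(42).toString();  // roughly what it becomes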

As String is already a reference type, it does not need a wrapper class (the way int needs Integer) to be used as an object, in collections for example.
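For example:

import java.util.ArrayList;
import java.util.List;

List<String> names = new ArrayList<>();    // String goes into collections directly
names.add("Ada");

List<Integer> numbers = new ArrayList<>(); // int cannot; it needs its wrapper, Integer
numbers.add(42);                           // the int literal is autoboxed to Integer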

Primitive?

In Java there's no primitive type for strings. The primitives are int, float, double, boolean, char, and so on.

So to support strings, they used an object: you instantiate it, it lives on the heap, and you hold a reference to it.

How did they implement it? By storing the value it represents in a char array.

Immutability

But they ensured immutability. When you have a reference to a String object, you can pass it freely to other objects, knowing the value pointed to by that reference will not change. Every method that "modifies" a string actually returns a new instance, so the value seen through other references to the String never changes.
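For example:

String s = "hello";
String t = s.toUpperCase();   // returns a brand-new String
System.out.println(s);        // hello -- the original is untouched
System.out.println(t);        // HELLO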

Could it be done another way (as in .NET)?

Yes. They could have defined a reserved word string and had the compiler do the transformation.

But they didn't...

A String is an array of char. As it is an array, it cannot be a primitive! :-)

Licensed under: CC-BY-SA with attribution