Question

Trying to understand strings better in C#

Are these assertions correct?

  • string is immutable
  • string is reference type but behaves like value type

For these code samples...

string x = "one";

(creates string instance holding "one" in memory)

x = "two";

(destroys "one" instance and creates new string instance holding "two" in memory, even though it is using the same variable x)

If the above are correct, what happens in a string array when one index value changes?

string[] array = new string[2];

array[0] = "boo";   // string "boo" created and held in 0 index
array[1] = "shoo";

array[0] = "moo";  

Does the last assignment create an entire new array to change boo to moo? My best "guess" is that the array holds pointers so that array[0] simply points to the new string instance that holds "moo". Is this correct? If not, could someone please clarify, thanks.

Solution

You have to be careful with some of the assumptions you are making about strings. The CLR maintains "a table, called the intern pool, which contains a single instance of each unique literal string constant declared in a program". So what is really happening in your examples is:

string x = "one"; // x references the "one" constant in the intern pool
x = "two"; // x now references the "two" constant in the intern pool

array[0] = "boo"; // array[0] references the "boo" constant in the intern pool
array[1] = "shoo"; // array[1] references the "shoo" constant in the intern pool
array[0] = "moo"; // array[0] now references the "moo" constant in the intern pool
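
The effect of the intern pool can be observed with reference comparisons. The following is a minimal sketch, not part of the original answer: the class and method names are hypothetical, and the results noted in the comments are what the mainstream .NET runtimes typically produce for literals compiled into the same assembly.

```csharp
using System;

// Hypothetical helper class, for illustration only.
public static class InternPoolSketch
{
    // Two identical literals typically resolve to one shared instance.
    public static bool LiteralsShareInstance()
    {
        string a = "one";
        string b = "one";
        return object.ReferenceEquals(a, b);
    }

    // A string assembled at run time has equal contents
    // but is a separate object from the literal.
    public static bool RuntimeStringIsSeparate()
    {
        string literal = "one";
        string built = new string(new[] { 'o', 'n', 'e' });
        return literal == built && !object.ReferenceEquals(literal, built);
    }

    // string.Intern returns the pooled instance with the same contents.
    public static bool InternReturnsPooledInstance()
    {
        string literal = "one";
        string built = new string(new[] { 'o', 'n', 'e' });
        return object.ReferenceEquals(literal, string.Intern(built));
    }
}
```

Each method can be called from any Main method, for example Console.WriteLine(InternPoolSketch.LiteralsShareInstance());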

Other tips

A string is immutable, correct. It does behave somewhat like a built-in value type, but this is mostly a consequence of both being immutable.

An array is not immutable, even if the elements it holds are immutable. So if an array slot holds a string, you can change it to a different string. This does not change any of the strings, it just changes the array. So yes, your guess is correct.

A string is fundamentally a reference type, which means the variable x does not actually contain a string, it contains a reference to string. When you set the variable to a different string, this only changes which reference the variable holds, but both strings are unaffected. Same thing with an array of strings - it is not really an array of strings but rather an array of references to strings. When you change an array item, you only change the cell to contain a different reference, but none of the actual strings are affected.
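
A short sketch of the array case, using the values from the question. The class, method, and the extra variables alias and first are hypothetical additions, there only to make the behavior observable.

```csharp
// Hypothetical helper class, for illustration only.
public static class ArraySlotSketch
{
    public static (bool SameArray, string OldString, string SlotNow) Reassign()
    {
        string[] array = new string[2];
        string[] alias = array;   // second reference to the same array object

        array[0] = "boo";
        array[1] = "shoo";
        string first = array[0];  // keep our own reference to "boo"

        array[0] = "moo";         // only the reference stored in slot 0 changes

        // Same array object as before; "boo" is untouched; slot 0 now refers to "moo".
        return (object.ReferenceEquals(array, alias), first, alias[0]);
    }
}
```

Calling ArraySlotSketch.Reassign() yields (true, "boo", "moo"): no new array is created, and the string "boo" still exists for as long as something references it.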

So it is not correct to say the string is destroyed when a variable is assigned a different reference. The string is unaffected. (Due to garbage collection, objects for which no references exist anymore may be removed from memory, but this is unpredictable and implementation-specific.)

By the way, none of this is affected by the fact that strings are immutable! What you describe is fundamental for how reference types work, and since you are only changing references and not the string/object itself, it doesn't matter if it is immutable or not.
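
To see that reassignment works the same way for a mutable reference type, here is a minimal sketch using StringBuilder in place of string. The substitution and all names are assumptions made for illustration; they do not come from the question.

```csharp
using System.Text;

// Hypothetical helper class, for illustration only.
public static class ReferenceSemanticsSketch
{
    public static (string YAfterReassign, string XAfterReassign, string YAfterMutate) Run()
    {
        var x = new StringBuilder("one");
        var y = x;                     // x and y refer to the same object

        x = new StringBuilder("two");  // rebinds x; the first object is untouched
        string yAfterReassign = y.ToString();
        string xAfterReassign = x.ToString();

        var z = y;                     // another reference to the first object
        z.Append("!");                 // mutates the object itself
        string yAfterMutate = y.ToString();

        return (yAfterReassign, xAfterReassign, yAfterMutate);
    }
}
```

Run() returns ("one", "two", "one!"). Reassigning x behaves exactly as it does with strings. Only Append, which changes the object itself, is visible through every reference, and string offers no such operation.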

string x = "one";

(creates string instance holding "one" in memory)

Sure. And assign a reference to that instance to variable x.

x = "two";

destroys "one" instance

Nope. Nothing is destroyed here. Don't think of strings as things that can be destroyed. And remember that the purpose of garbage collection is to free you from having to reason about destruction semantics in 99.9% of cases.

creates new string instance holding "two" in memory, even though it is using the same variable x

Better would be to say "and assigns a reference to that instance to variable x".

What happens in a string array when one index value changes?

Same thing.

string[] array = new string[2];
array[0] = "boo";   

(string "boo" created and held in 0 index)

Yep.

array[1] = "shoo";
array[0] = "moo";  

Does the last assignment create an entire new array to change boo to moo?

Nope.

My best "guess" is that the array holds pointers so that array[0] simply points to the new string instance that holds "moo". Is this correct?

Yes!

But think of them as references and not pointers. C# has pointers, which are seldom used and rather different. As an implementation detail, references are of course pointers behind the scenes, but don't think of them as pointers. Think of them as something that has reference semantics, because that is what you are guaranteed to get.

string x = "one" (creates string instance holding "one" in memory)

Maybe. If there is another "one" elsewhere, the runtime may reuse the existing instance.

x = "two" (destroys "one" instance and creates new string instance holding "two" in memory, even though it is using the same variable x)

Maybe. See above about instance reuse. Also, nothing will be destroyed immediately - the garbage collector still deals with strings.

If the above are correct, what happens in a string array when one index value changes?

Same as above - it removes a reference to the string instance which may make it available for garbage collection.

Licensed under: CC-BY-SA with attribution