Question

The reference is here: http://msdn.microsoft.com/en-us/library/system.string.intern.aspx

It looks like this is often done automatically by the compiler, but it can also be done manually. Please correct me if I am wrong and shed some more light on this. Does it matter whether the language is C#, VB.Net, C++/CLI, or something else?

Thanks.

Solution

I have done this in deserialization/materialization code when there is a good chance of repeated values (almost an enum, but not quite). When deserializing thousands of records this can give a significant memory benefit. However, in such cases you might prefer to use a separate intern cache, to avoid saturating the shared one (or maybe the shared one is fine; it depends on the scenario).

But the key point there is: a scenario where you are likely to have lots and lots of different string instances with the same value. Deserialization is a big candidate there. It should also be noted that there is some CPU overhead in checking the interned cache (progressively more overhead as you add data), so this should only be done if there is a chance that the constructed objects are going to live beyond gen-0; if they are always going to be collected quickly anyway, then it isn't worth swapping them for interned versions.
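
As a rough illustration of the "separate intern cache" idea, here is a minimal sketch in C#. The `StringPool` class and its members are hypothetical names invented for this example, not part of the .NET framework. Because the pool is an ordinary object, it can be dropped once deserialization finishes, unlike the shared intern table.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a framework type): a private pool that maps each
// distinct string value to one canonical instance.
public sealed class StringPool
{
    private readonly Dictionary<string, string> _pool =
        new Dictionary<string, string>(StringComparer.Ordinal);

    // Returns the pooled instance for this value, adding it on first sight.
    public string Intern(string value)
    {
        if (value == null) return null;
        if (_pool.TryGetValue(value, out var existing)) return existing;
        _pool.Add(value, value);
        return value;
    }

    public int Count => _pool.Count;
}

public static class PoolDemo
{
    public static void Main()
    {
        var pool = new StringPool();

        // Two distinct instances with the same value, as a deserializer
        // might produce for a repeated field.
        string a = pool.Intern(new string('x', 3));
        string b = pool.Intern(new string('x', 3));

        Console.WriteLine(object.ReferenceEquals(a, b)); // True
        Console.WriteLine(pool.Count);                   // 1
    }
}
```

Once the pool itself becomes unreachable, the strings it holds are only kept alive by the objects that still reference them, so nothing is rooted for the lifetime of the process.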

Other tips

It's a good idea to do so when profiling shows that it gives performance benefits.

It is done by the runtime, but a language could introduce its own string type with a different behavior. Automatic interning is only done for literal strings. If you want to intern dynamically created strings, you can do so manually. For one thing, it makes comparing strings really simple, but keep in mind that while some operations will benefit from interning, others will not. E.g. interned strings are not released until process shutdown (as they are rooted by the internal structure, see this question for details), so if you intern a lot of strings manually, the process will carry around a lot of memory.
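
To make the difference between dynamically created and manually interned strings concrete, here is a small C# sketch. It uses only `string.Intern` and `object.ReferenceEquals`; the class and variable names are made up for illustration.

```csharp
using System;

public static class InternDemo
{
    public static void Main()
    {
        // Two strings built at run time: equal values, separate instances.
        string first = new string(new[] { 'h', 'e', 'l', 'l', 'o' });
        string second = new string(new[] { 'h', 'e', 'l', 'l', 'o' });

        Console.WriteLine(first == second);                      // True (value equality)
        Console.WriteLine(object.ReferenceEquals(first, second)); // False

        // Manual interning: both calls return the same pooled instance.
        string i1 = string.Intern(first);
        string i2 = string.Intern(second);
        Console.WriteLine(object.ReferenceEquals(i1, i2));        // True

        // A literal is normally placed in the same pool by the runtime, so
        // this comparison is expected to succeed as well; no output is
        // asserted here because it depends on runtime interning behavior.
        Console.WriteLine(object.ReferenceEquals("hello", i1));
    }
}
```

After interning, equality checks can be reduced to a reference comparison, which is the "really simple" comparison mentioned above; the cost is that the pooled instances stay alive for as long as the runtime keeps its intern table.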

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow