
How to prevent a string from being interned

Tags: c#, memory

My understanding (which may be wrong) is that in C#, when you create a string, it gets interned into the "intern pool". The pool keeps a reference to each string so that multiple identical strings can share the same memory.

However, I am processing a lot of strings that are very likely unique, and I need to remove each of them from memory completely once I am done with it. I am not sure how the cached reference gets removed so that the garbage collector can reclaim the string data. How can I prevent a string from being interned in this pool, or how can I clear the pool or remove a string from it so that the string is definitely removed from memory?

Petr asked Apr 26 '13 09:04



2 Answers

If you need to remove the strings from memory for security reasons, use SecureString.

Otherwise, once there are no references to the string anywhere, the GC will clean it up anyway. Strings created at run time are not interned, so you don't need to worry about interning.

And of course, only string literals are interned in the first place (or strings that you pass to String.Intern() yourself, as noted above by Petr and others).
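One way to check this yourself is String.IsInterned, which returns null when no string with that value is in the intern pool. The following is only a minimal sketch; the InternCheck class and its method names are made up for illustration.

```csharp
using System;

static class InternCheck
{
    // True if a string with this value is currently in the intern pool.
    // String.IsInterned returns the pooled instance, or null if there is none.
    public static bool IsInPool(string s) => string.IsInterned(s) != null;

    // Builds a string at run time. The GUID makes the value unique,
    // so no literal with the same value can exist in the program.
    public static string MakeRuntimeString() => "item-" + Guid.NewGuid().ToString("N");
}
```

For a string s returned by InternCheck.MakeRuntimeString(), InternCheck.IsInPool(s) should be false until you call String.Intern(s) yourself.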

Matthew Watson answered Oct 31 '22 07:10


You are saying two things:

  • You are processing a lot of strings, so you are talking about runtime values.
  • You want to remove the strings from memory after you are done processing them.

By default, runtime values are NOT interned. When you receive a string from a file or create a string yourself, each one is a separate instance. You can intern them via String.Intern. Interning takes more time, but can consume less memory when the same values occur many times. See: http://msdn.microsoft.com/en-us/library/system.string.intern.aspx
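The "separate instance" point can be seen by comparing references rather than values. This is a rough sketch, not a benchmark; InternDemo, Copy and SameInstance are hypothetical names chosen for the example.

```csharp
using System;

static class InternDemo
{
    // Creates a new string object at run time with the same characters.
    // For a non-empty input, new string(char[]) always allocates a fresh instance.
    public static string Copy(string s) => new string(s.ToCharArray());

    // True only when both variables refer to the very same object.
    public static bool SameInstance(string a, string b) => ReferenceEquals(a, b);
}
```

Given a = InternDemo.Copy("order-12345") and b = InternDemo.Copy("order-12345"), the values are equal (a == b) but InternDemo.SameInstance(a, b) is false. InternDemo.SameInstance(String.Intern(a), String.Intern(b)) is true, because both calls return the single pooled instance.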

Runtime strings are automatically removed by the GC once there are no references to them. Interned strings are different: the intern pool itself keeps a reference to each one, and the String.Intern documentation linked above warns that memory allocated for interned strings is not likely to be released until the CLR terminates. So do not intern strings that you need the GC to reclaim.

So... to sum it up: by default your runtime strings are not interned, and the GC removes them once your work is done and nothing references them any more. Only strings that you explicitly pass to String.Intern stay in the pool.
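If you want to watch the GC reclaim a runtime string that was never interned, a WeakReference can serve as a probe. Treat this as a sketch under assumptions: GcProbe and its methods are invented for the example, and the GC gives no guarantee about when (or whether) a given collection reclaims an object, so the result is indicative only.

```csharp
using System;
using System.Runtime.CompilerServices;

static class GcProbe
{
    // Creates a runtime string and returns only a weak reference to it.
    // NoInlining keeps the strong local variable out of the caller's frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference TrackRuntimeString()
    {
        string s = new string('x', 1000) + Guid.NewGuid().ToString();
        return new WeakReference(s);
    }

    // Forces a collection and reports whether the tracked object is gone.
    public static bool IsCollectedAfterGc(WeakReference wr)
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return !wr.IsAlive;
    }
}
```

Calling GcProbe.IsCollectedAfterGc(GcProbe.TrackRuntimeString()) would typically report that the string is gone, since nothing holds a strong reference to it, but the exact behavior can vary with build configuration and runtime version.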

Martin Mulder answered Oct 31 '22 06:10