I've seen many primitive examples describing how String intern()'ing works, but I have yet to see a real-life use-case that would benefit from it.
The only situation that I can dream up is having a web service that receives a considerable amount of requests, each being very similar in nature due to a rigid schema. By intern()'ing the request field names in this case, memory consumption can be significantly reduced.
Can anyone provide an example of using intern() in a production environment with great success? Maybe an example of it in a popular open source offering?
Edit: I am referring to manual interning, not the guaranteed interning of String literals, etc.
intern() The method intern() creates an exact copy of a String object in the heap memory and stores it in the String constant pool. Note that, if another String with the same contents exists in the String constant pool, then a new object won't be created and the new reference will point to the other String.
A few of its imperative applications are Spell Checkers, Spam Filters, Intrusion Detection System, Search Engines, Plagiarism Detection, Bioinformatics, Digital Forensics and Information Retrieval Systems etc.
Java String intern() method explained with examples This method ensures that all the same strings share the same memory. For example, creating a string “hello” 10 times using intern() method would ensure that there will be only one instance of “Hello” in the memory and all the 10 references point to the same instance.
String Interning is a method of storing only one copy of each distinct String Value, which must be immutable. By applying String. intern() on a couple of strings will ensure that all strings having the same contents share the same memory.
Interning can be very beneficial if you have N
strings that can take only K
different values, where N
far exceeds K
. Now, instead of storing N
strings in memory, you will only be storing up to K
.
For example, you may have an ID
type which consists of 5 digits. Thus, there can only be 10^5
different values. Suppose you're now parsing a large document that has many references/cross references to ID
values. Let's say this document have 10^9
references total (obviously some references are repeated in other parts of the documents).
So N = 10^9
and K = 10^5
in this case. If you are not interning the strings, you will be storing 10^9
strings in memory, where lots of those strings are equals
(by Pigeonhole Principle). If you intern()
the ID
string you get when you're parsing the document, and you don't keep any reference to the uninterned strings you read from the document (so they can be garbage collected), then you will never need to store more than 10^5
strings in memory.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With