In my current project I am processing a fairly large amount of data, and the processing should be both memory-efficient and fast. Every item has some metadata that can be read very quickly and is almost always of interest. In addition, every item has the actual data, which is read comparatively rarely, but reading and especially parsing it is very time-consuming. It therefore seems natural to parse the data only when it is actually requested.
For that purpose I was thinking of lazy values:
class Item(metaData: MetaData, dataString: String) {
  lazy val data = parse(dataString)
}
Now the data is only parsed if it is actually requested. The problem is that both dataString and the parsed data are kept in memory. As far as I can see, "dataString" can no longer be accessed once "data" has been evaluated (or can it?), so it could be garbage collected. Unfortunately, this does not seem to happen.
Is there a way to solve the problem differently, or to give the garbage collector a hint to collect dataString here?
You just need a little bit more tooling:
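class Item(dataString: String) {
  // The only reference to the raw string that the instance holds on to.
  private var storedData = dataString
  lazy val data = {
    val temp = parse(storedData)
    // Drop the raw string once it has been parsed.
    storedData = null
    temp
  }
}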
An extra reference to dataString is not kept because you never refer to it outside of the constructor (which sets storedData), and the reference you store in storedData is nulled out once you use it, so the string is then free to be garbage collected.
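For anyone who wants to try this, here is a self-contained sketch of the same pattern that also keeps the metaData parameter from the question. MetaData, ParsedData, Parser.parse and rawReleased are hypothetical stand-ins of mine, since the original post does not show the real types or the parser; the comma split merely takes the place of the expensive parsing step.

```scala
// Hypothetical stand-ins: the original post does not show MetaData or parse.
final case class MetaData(name: String)
final case class ParsedData(tokens: Vector[String])

object Parser {
  // Placeholder for the expensive parsing step (assumption: comma-separated input).
  def parse(raw: String): ParsedData = ParsedData(raw.split(',').toVector)
}

class Item(val metaData: MetaData, dataString: String) {
  // dataString is only used here, in the constructor body, so this var is the
  // one reference to the raw string that the instance holds.
  private[this] var storedData: String = dataString

  lazy val data: ParsedData = {
    val parsed = Parser.parse(storedData)
    storedData = null // release the raw string after the first (and only) parse
    parsed
  }

  // Hypothetical helper, added only so the behaviour can be observed.
  def rawReleased: Boolean = storedData == null
}

object Demo {
  def main(args: Array[String]): Unit = {
    val item = new Item(MetaData("example"), "a,b,c")
    println(item.rawReleased)  // raw string is still referenced before parsing
    println(item.data.tokens)  // first access triggers the parse
    println(item.rawReleased)  // reference has been dropped after parsing
  }
}
```

Note that nulling the field only makes the string eligible for collection; when the memory is actually reclaimed is still up to the garbage collector, and the string stays alive if anything else (for example the caller) still references it.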