Consider this snippet:
    l = []
    while 1
      l << 'a random 369-characterish string'
    end
    ^C
    # ran this for maybe 4 seconds, and it had 27 million entries in l;
    # memory usage was 1.6 GB
    l = nil
    # no change in memory usage
    GC.start
    # memory usage drops a relatively small amount, from 1.6 GB to 1.39 GB
I am pushing millions of elements into and through Ruby's data structures, and I am having some serious memory issues. This example demonstrates that even when there is no remaining reference to an object, Ruby will not let [most of] it go, even after an explicit call to GC.start.
My real code pushes millions of elements into a hash in total, but the hash is only a temporary lookup table and is zeroed out after each loop completes. The memory from this lookup table apparently never gets released, and this slows my application horrendously and progressively, because the GC has millions of defunct objects to analyze on each cycle. I am working on a workaround with the sparsehash gem, but this doesn't seem like an intractable problem that the Ruby runtime should choke on. The references are clearly deleted, and the objects should clearly be collected and disposed of. Can anyone help me figure out why this is not happening?
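Roughly, the access pattern looks like this (the names and numbers are illustrative, not my actual code):

    Record = Struct.new(:key, :value)
    records = Array.new(1_000_000) { |i| Record.new(i, 'a random 369-characterish string') }

    lookup = {}
    records.each_slice(10_000) do |batch|
      batch.each { |r| lookup[r.key] = r }  # fill the temporary lookup table
      # ... work that needs fast key lookups on the batch goes here ...
      lookup = {}                           # zero the table out for the next batch
    end
    # the memory held by all the discarded hashes is never returned to the OS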
I have tried l.delete_if { |x| true } on the suggestion of a user in #ruby on freenode, but this was really slow and also never seemed to cause an appreciable release of memory.
Using ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-linux].
EDIT: For comparison, here is a run in python3:
    l = []
    while 1:
        l.append('a random 369-characterish string')
    ^C
    # 31,216,082 elements; 246M memory usage
    l = []
    # memory usage drops to 8K (0% of system total)
A test on python2 shows nearly identical results.
I'm not sure whether this is sufficient to consider it an implementation deficiency in MRI, or whether it should just be chalked up to different approaches to GC. Either way, it seems like Python is better suited to use cases that push millions of elements through their data structures in total and periodically zero the structures out (as one might do for a temporary lookup table).
It really does seem like this should be a simple one. :\
Trimming to fix Ruby memory bloat: one way to address slow memory release is to make the allocator return free pages to the OS more often. glibc exposes an API for this called malloc_trim; one approach is to modify Ruby to call this function during the garbage collection process.
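You don't necessarily have to patch Ruby for a quick experiment, though. Here is an untested sketch that calls malloc_trim via the standard fiddle library, assuming glibc on Linux (on other platforms the symbol won't exist) and a Ruby built with libffi:

    require 'fiddle'

    # Look up glibc's malloc_trim(size_t) among the already-loaded symbols.
    malloc_trim = Fiddle::Function.new(
      Fiddle::Handle::DEFAULT['malloc_trim'],
      [Fiddle::TYPE_SIZE_T],   # size_t pad
      Fiddle::TYPE_INT
    )

    GC.start             # let Ruby collect its dead objects first...
    malloc_trim.call(0)  # ...then ask glibc to give free pages back to the OS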
Note that native code which allocates with malloc and never calls free will also leak. That kind of leak won't show up in any heap dump or in GC.stat, but you will still see the process's memory usage grow.
Finding leaks in Ruby: detecting a leak is simple enough. You can use GC, ObjectSpace, and the RSS graphs in your APM tool to watch your memory usage increase.
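For example, a rough sketch of watching live objects from inside the process (the exact GC.stat keys vary across Ruby versions, so treat the output as illustrative):

    GC.start
    p ObjectSpace.count_objects[:T_STRING]  # live String slots after a full GC
    p GC.stat                               # heap counters; key names vary by Ruby version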
Kind of hacky, but you can try forking the operation off as a separate process. The forked process starts out sharing memory with its parent; when it terminates, everything it allocated is freed back to the OS.
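A rough sketch of what that could look like, assuming the result of the work is small enough to marshal back over a pipe (the workload here is a stand-in):

    # Do the hash-heavy work in a child process; its memory is
    # unconditionally returned to the OS when it exits.
    reader, writer = IO.pipe

    pid = fork do
      reader.close
      lookup = {}                      # the huge temporary table lives here
      1_000_000.times { |i| lookup[i] = 'a random 369-characterish string' }
      summary = lookup.size            # stand-in for the real, small result
      writer.write(Marshal.dump(summary))
      writer.close
    end

    writer.close
    summary = Marshal.load(reader.read)
    Process.wait(pid)
    puts summary   # the child's gigabytes die with the child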
Ruby might not be releasing memory back to the kernel, as @Sergio Tulentsev pointed out in the comments.
This Ruby/Unix mailing list conversation describes this in detail: Avoiding system calls
Also, this blog post describes forking as a solution for memory management in Rails: Saving memory in Ruby on Rails with fork() and copy-on-write. That said, I don't think Ruby will support copy-on-write until Ruby 2 comes out.