
Ruby Hash memory leak after key deletion

Hello, I can't figure out how to release memory after deleting keys from a Hash. When I delete a key from a Hash, the memory is not released, not even after calling GC.start manually. Is this expected behavior, or does the GC fail to release memory when keys are deleted from a Hash, so that these objects leak somewhere? How can I delete a key from a Hash in Ruby and also deallocate its memory?

Example:

irb(main):001:0> `ps -o rss= -p #{Process.pid}`.to_i
=> 4748
irb(main):002:0> a = {}
=> {}
irb(main):003:0> 1000000.times{|i| a[i] = "test #{i}"}
=> 1000000
irb(main):004:0> `ps -o rss= -p #{Process.pid}`.to_i
=> 140340
irb(main):005:0> 1000000.times{|i| a.delete(i)}
=> 1000000
irb(main):006:0> `ps -o rss= -p #{Process.pid}`.to_i
=> 140364
irb(main):007:0> GC.start
=> nil
irb(main):008:0> `ps -o rss= -p #{Process.pid}`.to_i
=> 127076

PS: I use Ruby 1.8.7. I have also tried Ruby 1.9.2, but it was no better.

Adam Kliment asked May 11 '11 18:05



3 Answers

See Stack Overflow: How malloc and free work

For a variety of good reasons (spelled out in the question linked above), most memory managers rarely release freed memory back to the operating system.

For you to see the process size change, the underlying malloc and free in the C part of the Ruby interpreter would need to give the memory back to the host OS. That generally does not happen. At the Ruby level, however, the objects have been GC'ed and sit on a free list kept locally by the interpreter.
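
One way to check this at the Ruby level is to count live String objects instead of looking at the process size. The following is only a minimal sketch: the helper names live_strings and string_counts are made up for illustration, and ObjectSpace.count_objects is assumed to be available, which means Ruby 1.9 or later (it does not exist in 1.8.7). The exact numbers depend on the Ruby version and on whatever else the interpreter has allocated.

```ruby
# Number of live String objects after a full GC run.
# (Hypothetical helper; needs ObjectSpace.count_objects, Ruby 1.9+.)
def live_strings
  GC.start
  ObjectSpace.count_objects[:T_STRING]
end

# Fill a hash with n strings, delete every key again, and record the
# live String count at each stage. (Hypothetical helper.)
def string_counts(n)
  counts = { before: live_strings }
  hash = {}
  n.times { |i| hash[i] = "test #{i}" }
  counts[:filled] = live_strings
  n.times { |i| hash.delete(i) }
  counts[:after] = live_strings
  counts
end

counts = string_counts(200_000)
puts "live strings before fill:   #{counts[:before]}"
puts "live strings after fill:    #{counts[:filled]}"
puts "live strings after delete:  #{counts[:after]}"
```

If the count after deletion falls back close to the starting value, the strings were collected and nothing is leaking inside Ruby, even when the RSS reported by ps stays high.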

DigitalRoss answered Oct 14 '22 18:10



As a senior developer who's been doing this stuff a long time in a lot of languages, here are my thoughts:

Your intention would be sound in a compiled language such as C, but fine-grained, developer-controlled memory management doesn't fit how languages like Ruby, Python and Perl do things.

Scripting languages like Perl, Ruby and Python insulate us from the worries of memory management. That's one of the reasons we like them. If we have the memory available, they'll use what they need in order to get the job done. The loss of memory management control is a tradeoff for speed of development and ease of debugging; we don't have it, and don't need to worry about it. If I need it I'll use C or assembly language.

As far as assuming it's a memory leak, well, I think that's a bit naive or presumptuous. A memory leak like you mention would be a significant leak, and, with as many Ruby-based apps and sites as there are, someone would have noticed it a long time ago. So, as a sanity check for myself when I see something that doesn't make sense, I always figure I'm doing something wrong in my code first, then I'll take a look at my assumptions about how something works, and if those still seem sound, I'll go looking for other people who have similar problems and see if they have solutions. And, if the problem is something that would be core to the language, I'll dig into the source or talk to some of the core developers and ask if I'm nuts with what I'm seeing. I've found low-level bugs before but they've been corner cases, and I spent a couple days digging around before I mentioned anything, because I didn't want to be like a peer of mine who'd file a bug report with Apple immediately, then find out it was a bug in his code.

My overall thinking about returning memory to the system on deallocation is that it incurs additional overhead that might be reversed by the very next operation, wasting CPU cycles that interpreted and scripting languages cannot afford, since they're not as fast as compiled languages to begin with. I think it's fair for the language to assume that it will need to repeatedly allocate a big block of memory if it's had to do it once, especially with an OO language like Ruby. At that point it makes a lot of sense to hold on to the memory previously used.

And, in the big scheme of things, allocating 1,000,000 hash entries of that size isn't a lot of memory considering how much we routinely have free in our boxes. I would be more concerned about the need to maintain 1,000,000 entries in a hash in memory and would recommend to a peer that they look seriously at using a database. You might have a sound business reason for holding it all in RAM. If so, max out the RAM on the host and you should be fine.

the Tin Man answered Oct 14 '22 18:10



The objects should be garbage collected. If you were to create them again, the process should not grow significantly, because it has all that empty space. However, Ruby does not release that memory back to the OS, because it assumes that it is likely to need that much memory again in the future.

This is a rather simplistic explanation, but basically, what you are seeing is normal.
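
To see the reuse for yourself, you can repeat the fill/delete cycle a few times and watch how many object slots the interpreter holds. This is only a rough sketch: fill_and_empty and slot_history are made-up helper names, and the GC.stat keys :heap_available_slots and :heap_free_slots are assumed to exist, which is the case from Ruby 2.2 on (older versions use different key names). GC.stat only covers Ruby's own object slots, not everything malloc has handed out, so treat the numbers as an indication rather than a full accounting.

```ruby
# Fill a hash with n short strings, then delete every key again.
# (Hypothetical helper.)
def fill_and_empty(hash, n)
  n.times { |i| hash[i] = "test #{i}" }
  n.times { |i| hash.delete(i) }
end

# Run several fill/delete cycles on the same hash. After each cycle,
# force a GC and record how many object slots the interpreter holds
# and how many of them are currently free.
# (Hypothetical helper; GC.stat keys assume Ruby 2.2+.)
def slot_history(n, cycles)
  hash = {}
  (1..cycles).map do
    fill_and_empty(hash, n)
    GC.start
    stat = GC.stat
    { available: stat[:heap_available_slots], free: stat[:heap_free_slots] }
  end
end

slot_history(200_000, 3).each_with_index do |s, i|
  puts "cycle #{i + 1}: #{s[:available]} slots held, #{s[:free]} free"
end
```

If the number of slots held stays roughly level from one cycle to the next, the later cycles are being served from the space freed by the earlier ones, which is the behavior described above.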

Austin Taylor answered Oct 14 '22 18:10
