I have a worker process which spawns up to 50 threads and do some async operations (most of which are http calls). When I start up the process, it starts with some 35MB of used memory, and quickly grows to 250MB. From that point it grows further more and the problem is that the memory never stops growing (even though the growing phase decreases over time). After several days, process just outgrows the available memory and crashes.
I did a lot of analysis and profiling and can't seem to find what is wrong. Process memory is constantly growing, even though the heap size is pretty much constant. I've collected GC.stat
output into spreadsheet that you can access here:
https://docs.google.com/spreadsheets/d/17TohDNXQ_MXM31CeAmR2ptHFYfvOeF3dB6WCBkBS_Bc/edit?usp=sharing
Even though it seems that the process memory has finally stabilized at 415MB, it will continue to grow over next couple of days until it reaches 512MB limit and crashes.
I've also tried tracking objects with objectspace, but the sum of memory of tracked objects never crosses 70-80MB which perfectly aligns with GC reports. Where are the remaining 300MB+ (and growing) spent.. i have no clue.
How to deal with these kinds of problems? Are there any tools that could give me clearer insight on how the memory is being consumed?
UPDATE: Gems and OS
I'm using following gems:
gem "require_all", "~> 1.3"
gem "thread", "~> 0.1"
gem "equalizer", "~> 0.0.9"
gem "digest-murmurhash", "~> 0.3", require: "digest/murmurhash"
gem "google-api-client", "~> 0.7", require: "google/api_client"
gem "aws-sdk", "~> 1.44"
The application is deployed on heroku, though memory leak is noticable when running it locally on Mac OS X 10.9.4.
UPDATE: Leaks
I've upgraded stringbuffer and analyzed everything like @mtm suggested and now there are no memory leaks identified by leak
tool, no increases in ruby heap size over time, and yet, the process memory is still growing. Originally I thought that it stopped growing at some point, but several hours later it outgrew the limit and process crashed.
Slow Release. Another important cause of memory bloat in Ruby is a slow release of freed memory back to the system. In this situation, memory is freed much more slowly than the rate at which new memory blocks are allocated to objects.
Memory is "leaking" out of the heap, never to be seen again. You allocate the memory, work with it until the program terminates. This is not a memory leak; it doesn't impair the program, and all the memory will be scavenged up automagically when the program terminates. Generally, you should avoid memory leaks.
OSX has a useful dev tool called leaks that can tell you if it finds unreferenced memory in a running process. Not very useful for identifying where the memory comes from in Ruby but will help to identify when it is occurring. First to be tested is digest-stringbuffer. Grab the example from the Readme and add in some GC logging with gc_tracer
Isolating c-extension memory leaks often involves valgrind and custom compiled versions of Ruby that support debugging with valgrind. It is tremendously hard work we only want to deal with as a last resort.
At some point in the life of every Rails developer you are bound to hit a memory leak. It may be tiny amount of constant memory growth, or a spurt of growth that hits you on the job queue when certain jobs run.
When dealing with memory leaks in c-extensions, having this information is critical. Isolating c-extension memory leaks often involves valgrind and custom compiled versions of Ruby that support debugging with valgrind. It is tremendously hard work we only want to deal with as a last resort.
From your GC logs it appears the issue is not a ruby object reference leak as the heap_live_slot
value is not increasing significantly. That would suggest the problem is one of:
It's interesting to note that the problem exhibits on both OSX and Heroku (Ubuntu Linux).
Ruby 2.1 garbage collection uses the reported "heap" only for Objects that contain a tiny amount of data. When the data contained in an Object goes over a certain limit, the data is moved and allocated to an area outside of the heap. You can get the overall size of each data type with ObjectSpace:
require 'objspace'
ObjectSpace.count_objects_size({})
Collecting this along with your GC stats might indicate where memory is being allocated outside the heap. If you find a particular type, say :T_ARRAY
increasing a lot more than the others you might need to look for an array you are forever appending to.
You can use pry-byebug
to drop into a console to troll around specific objects, or even looking at all objects from the root:
ObjectSpace.memsize_of(some_object)
ObjectSpace.reachable_objects_from_root
There's a bit more detail on one of the ruby developers blog and also in this SO answer. I like their JRuby/VisualVM profiling idea.
Use bundle
to install your gems into a local path:
bundle install --path=.gems/
Then you can find those that include native code:
find .gems/ -name "*.c"
Which gives you: (in my order of suspiciousness)
OSX has a useful dev tool called leaks
that can tell you if it finds unreferenced memory in a running process. Not very useful for identifying where the memory comes from in Ruby but will help to identify when it is occurring.
First to be tested is digest-stringbuffer
. Grab the example from the Readme and add in some GC logging with gc_tracer
require "digest/stringbuffer"
require "gc_tracer"
GC::Tracer.start_logging "gclog.txt"
module Digest
class Prime31 < StringBuffer
def initialize
@prime = 31
end
def finish
result = 0
buffer.unpack("C*").each do |c|
result += (c * @prime)
end
[result & 0xffffffff].pack("N")
end
end
end
And make it run lots
while true do
a=[]
500.times do |i|
a.push Digest::Prime31.hexdigest( "abc" * (1000 + i) )
end
sleep 1
end
Run the example:
bundle exec ruby ./stringbuffertest.rb &
pid=$!
Monitor the resident and virtual memory sizes of the ruby
process, and the count of leaks
identified:
while true; do
ps=$(ps -o rss,vsz -p $pid | tail +2)
leaks=$(leaks $pid | grep -c Leak)
echo "$(date) m[$ps] l[$leaks]"
sleep 15
done
And it looks like we've found something already:
Tue 26 Aug 2014 18:22:36 BST m[104776 2538288] l[8229]
Tue 26 Aug 2014 18:22:51 BST m[110524 2547504] l[13657]
Tue 26 Aug 2014 18:23:07 BST m[113716 2547504] l[19656]
Tue 26 Aug 2014 18:23:22 BST m[113924 2547504] l[25454]
Tue 26 Aug 2014 18:23:38 BST m[113988 2547504] l[30722]
Resident memory is increasing and the leaks tool is finding more and more unreferenced memory. Confirm the GC heap size, and object count looks stable still
tail -f gclog.txt | awk '{ print $1, $3, $4, $7, $13 }
1581853040832 468 183 39171 3247996
1581859846164 468 183 33190 3247996
1584677954974 469 183 39088 3254580
1584678531598 469 183 39088 3254580
1584687986226 469 183 33824 3254580
1587512759786 470 183 39643 3261058
1587513449256 470 183 39643 3261058
1587521726010 470 183 34470 3261058
Then report the issue.
It appears to my very untrained C eye that they allocate both a pointer and a buffer but only clean up the buffer.
Looking at digest-murmurhash
, it seems to only provide functions that rely on StringBuffer so the leak might be fine once stringbuffer is fixed.
When they have patched it, test again and move onto the next gem. It's probably best to use snippets of code from your implementation for each gem test rather than a generic example.
First step would be to prove the issue on multiple machines under the same MRI to rule out anything local, which you've already done.
Then try the same Ruby version on a different OS, which you've done too.
Try the code on JRuby or Rubinius if possible. Does the same issue occur?
Try the same code on 2.0 or 1.9 if possible, see if the same problem exists.
Try the head development version from github and see if that makes any difference.
If nothing becomes apparent, submit a bug to Ruby detailing the issue and all the things you have eliminated. Wait for a dev to help out and provide whatever they need. They will most likely want to reproduce the issue so if you can get the most concise/minimal example of the issue set up. Doing that will often help you identify what the issue is anyway.
I fix memory leak and release digest/stringbuffer v0.0.3.
https://rubygems.org/gems/digest-stringbuffer
You can try again by v0.0.3.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With