It seems that parsing the same JSON file over and over again in Ruby uses more and more memory. Consider the code and the output below:
Code:
require 'json'
def memused
  `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`.strip.split.map(&:to_i)[1] / 1024
end
text = IO.read('../data-grouped/2012-posts.json')
puts "before parsing: #{memused}MB"
iter = 1
while true
  items = JSON.parse(text)
  GC.start
  puts "#{iter}: #{memused}MB"
  iter += 1
end
Output:
before parsing: 116MB
1: 1840MB
2: 2995MB
3: 2341MB
4: 3017MB
5: 2539MB
6: 3019MB
When Ruby parses a JSON file, it creates many intermediate objects along the way. These objects stay in memory until the GC starts working.
If the JSON file has a complicated structure, with many arrays and nested objects, the number of intermediate objects grows quickly too.
Did you try calling "GC.start" to ask Ruby to clean up unused memory? If the amount of memory decreases significantly, that suggests it was just the intermediate objects used to parse the data; otherwise, your data structure is complex, or there is something in your data that the library can't deallocate.
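As a rough sketch of that check (using the same file path as in the question; the exact numbers will vary), you can also compare live object counts before parsing, after parsing, and after GC.start:

require 'json'

# Live objects = total heap slots minus free slots (MRI-specific counters).
counts = ObjectSpace.count_objects
before = counts[:TOTAL] - counts[:FREE]

items = JSON.parse(IO.read('../data-grouped/2012-posts.json'))
counts = ObjectSpace.count_objects
after_parse = counts[:TOTAL] - counts[:FREE]

GC.start
counts = ObjectSpace.count_objects
after_gc = counts[:TOTAL] - counts[:FREE]

# A large drop after GC.start points at collectable intermediate objects;
# whatever remains is roughly the parsed structure still referenced by `items`.
puts "live objects: #{before} -> #{after_parse} -> #{after_gc}"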
For processing large JSON files I use yajl-ruby (https://github.com/brianmario/yajl-ruby). It is implemented in C and has a low memory footprint.
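A minimal sketch (assuming the same file path as in the question): yajl-ruby can parse straight from an IO, so you don't have to read the whole file into a String first:

require 'yajl'

# Yajl::Parser.parse accepts an IO or a String; parsing from the file
# handle avoids holding the raw JSON text in memory alongside the result.
json = File.open('../data-grouped/2012-posts.json', 'r') do |io|
  Yajl::Parser.parse(io)
end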