Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is ActiveSupport::JSON.encode so slow?

JRuby 9.2, Rails 4.2

I've been work with reading/dumping large JSON objects, 100+MB files, which would routinely hang for a few minutes then run out of memory when writing them out.

I noticed it specifically happens when using to_json. Below is with a 4MB file so it actually completes:

pry(main)> f = File.open('nvdcve-1.0-recent.json'){|f| JSON.load(f) }
pry(main)> puts Benchmark.measure{ JSON.dump(f) }
  0.230000   0.010000   0.240000 (  0.081894)
=> nil
pry(main)> puts Benchmark.measure{ f.to_json }
  1.020000   0.020000   1.040000 (  0.820851)
=> nil

The source of the problem being that ActiveSupport is overriding to_json on a bunch of things:

pry(main)> f.method(:to_json)
=> #<Method: Hash#to_json(to_json_with_active_support_encoder)>
pry(main)> puts Benchmark.measure{ f.to_json_without_active_support_encoder }
  0.040000   0.000000   0.040000 (  0.035408)
pry(main)> puts Benchmark.measure{ f.to_json_with_active_support_encoder }
  1.170000   0.010000   1.180000 (  0.812674)

See here: https://apidock.com/rails/Object/to_json_with_active_support_encoder

So what exactly is ActiveSupport::JSON.encode doing differently from the JSON gem's to_json that is causing it to be so much slower, even on an unremarkable JSON file that was loaded from a JSON file to begin with?

Would I be breaking any Rails functionality if I overrode to_json to not use the ActiveSupport variant?

like image 227
Karl Avatar asked Aug 02 '19 22:08

Karl


2 Answers

According to the source ActiveSupport::JSON is pretty much just an additional layer on top of the JSON gem - which actually does most of the heavy lifting. What it mainly does is:

Escape some additional characters which can be problematic in certain browsers.

ESCAPED_CHARS = {
            "\u2028" => '\u2028',
            "\u2029" => '\u2029',
            ">"      => '\u003e',
            "<"      => '\u003c',
            "&"      => '\u0026',
            }

The comments don't state which the problematic browsers in question are and this might be a total non-issue for modern browsers.

Casts objects to suitable JSON representations:

      # Convert an object into a "JSON-ready" representation composed of
      # primitives like Hash, Array, String, Numeric,
      # and +true+/+false+/+nil+.
      # Recursively calls #as_json to the object to recursively build a
      # fully JSON-ready object.
      #
      # This allows developers to implement #as_json without having to
      # worry about what base types of objects they are allowed to return
      # or having to remember to call #as_json recursively.
      #
      # Note: the +options+ hash passed to +object.to_json+ is only passed
      # to +object.as_json+, not any of this method's recursive +#as_json+
      # calls.
      def jsonify(value)
        case value
        when String
          EscapedString.new(value)
        when Numeric, NilClass, TrueClass, FalseClass
          value.as_json
        when Hash
          Hash[value.map { |k, v| [jsonify(k), jsonify(v)] }]
        when Array
          value.map { |v| jsonify(v) }
        else
          jsonify value.as_json
        end
      end

This is probably the most important part as the #as_json is pretty much key to how models present themselves as JSON. It also handles casting dates and timestamps and other non-primitives (ActiveSupport provides #as_json methods for these). Removing this would most likely break a lot of things.

Depending your use case you could perhaps manually create ruby objects (arrays, hashes, etc) from your model objects and use JSON.generate(data) - I'm not sure it will perform better though.

like image 52
max Avatar answered Nov 20 '22 01:11

max


tl;dr: ActiveSupport::JSON#encode is doing three extra things.

  1. Calling as_json asking each object to provide a data structure suitable for serialization.
  2. "jsonifying" that data into JSON primitives and handling recursion.
  3. Escaping HTML in strings.

JSON.dump decides how things will become JSON, you have no control. The as_json callback allows you to control exactly how your objects are dumped. For simple objects this adds a bit of overhead. For everything else it's very important.

jsonify makes dumping objects easier. It chews up about 29% of the time.

This allows developers to implement #as_json without having to worry about what base types of objects they are allowed to return or having to remember to call #as_json recursively.

Escaping HTML allows one to just plop JSON into an HTML page. That's about 38% of the time. That's of limited use for chewing up so much time. Unfortunately it can't be easily turned off. That might be a worthwhile patch.

Just 10% is spent inside the real JSON encoder. This roughly concurs with my Benchmark results.

[12] pry(main)> puts Benchmark.measure{ ActiveSupport::JSON.encode(h) }
  1.241299   0.009459   1.250758 (  1.253938)
=> nil
[13] pry(main)> puts Benchmark.measure{ JSON.dump(h) }
  0.117118   0.003333   0.120451 (  0.120881)

Note: This is with Ruby 2.6.3 and Rails 5.2.3. That's what I've got, and performance issues with old versions are less interesting. You can run them yourself, I expect it to be about the same.

Running this with ruby-prof we can see that JSON.dump is spending all its time in just a few functions.

$ rails -e 'require "json"; require "ruby-prof"; h = File.open("nvdcve-1.0-recent.json"){|f| JSON.load(f) }; result = RubyProf.profile { JSON.dump(h) }; printer = RubyProf::GraphPrinter.new(result); printer.print(STDOUT, {})' > json.prof
------------------------------------------------------------------------------------------------------------------------------------------------------
                      0.275      0.134      0.000      0.141              1/1     <Module::JSON>#generate
  99.98%  48.78%      0.275      0.134      0.000      0.141                1     JSON::Ext::Generator::State#generate 
                      0.097      0.097      0.000      0.000    401143/401143     String#encode
                      0.021      0.021      0.000      0.000    101047/101047     Hash#keys
                      0.021      0.021      0.000      0.000    213358/213358     String#to_s
                      0.001      0.001      0.000      0.000        2058/2058     Float#to_s
------------------------------------------------------------------------------------------------------------------------------------------------------
                      0.097      0.097      0.000      0.000    401143/401143     JSON::Ext::Generator::State#generate
  35.26%  35.26%      0.097      0.097      0.000      0.000           401143     String#encode                  
------------------------------------------------------------------------------------------------------------------------------------------------------
                      0.021      0.021      0.000      0.000    101047/101047     JSON::Ext::Generator::State#generate
   7.79%   7.79%      0.021      0.021      0.000      0.000           101047     Hash#keys                      
------------------------------------------------------------------------------------------------------------------------------------------------------
                      0.021      0.021      0.000      0.000    213358/213358     JSON::Ext::Generator::State#generate
   7.77%   7.77%      0.021      0.021      0.000      0.000           213358     String#to_s                    
------------------------------------------------------------------------------------------------------------------------------------------------------
  • 49% actually generating JSON.
  • 8% iterating through hash keys.
  • 35% in String#encode, presumably to ensure all strings are UTF-8 as required.
  • 8% stringifying things.

Now with ActiveSupport::JSON.encode(h). For some reason h.to_json doesn't go through encode from the command line, but it does in console.

$ rails -e 'require "json"; require "ruby-prof"; h = File.open("nvdcve-1.0-recent.json"){|f| JSON.load(f) }; result = RubyProf.profile { ActiveSupport::JSON.encode(h) }; printer = RubyProf::GraphPrinter.new(result); printer.print(STDOUT, {})' > as.prof

It's profile is more complicated. Starting from the top.

------------------------------------------------------------------------------------------------------------------------------------------------------
                      3.020      0.000      0.000      3.020              1/1     <Module::ActiveSupport::JSON>#encode
  96.58%   0.00%      3.020      0.000      0.000      3.020                1     ActiveSupport::JSON::Encoding::JSONGemEncoder#encode /Users/schwern/.rvm/gems/ruby-2.6.3/gems/activesupport-5.2.3/lib/active_support/json/encoding.rb:34
                      1.483      0.000      0.000      1.483              1/1     ActiveSupport::JSON::Encoding::JSONGemEncoder#stringify
                      0.894      0.000      0.000      0.894         1/519065     ActiveSupport::JSON::Encoding::JSONGemEncoder#jsonify
                      0.643      0.000      0.000      0.643         1/101047     Hash#as_json
                      0.000      0.000      0.000      0.000         1/305710     Kernel#dup
------------------------------------------------------------------------------------------------------------------------------------------------------

Its spending 95% of its time in ActiveSupport::JSON::Encoding::JSONGemEncoder#encode.

def encode(value)
  stringify jsonify value.as_json(options.dup)
end

value.as_json asks the value to return a data structure it wants serialized as JSON. This is scattered amongst different methods.

These don't take up a lot of time themselves, but they do recurse eating up up 21% of the total time.

------------------------------------------------------------------------------------------------------------------------------------------------------
                      0.634      0.009      0.000      0.625      9800/101047     Hash#each
                      0.642      0.075      0.000      0.567     91246/101047     Array#map
                      0.643      0.000      0.000      0.643         1/101047     ActiveSupport::JSON::Encoding::JSONGemEncoder#encode
  20.56%   2.69%      0.643      0.084      0.000      0.559           101047    *Hash#as_json                   /Users/schwern/.rvm/gems/ruby-2.6.3/gems/activesupport-5.2.3/lib/active_support/core_ext/object/json.rb:157
                      0.643      0.031      0.000      0.612    101047/202094     Enumerable#map
                      0.036      0.036      0.000      0.000    101047/202094     <Class::Hash>#[]
------------------------------------------------------------------------------------------------------------------------------------------------------
                      0.643      0.003      0.000      0.639      10311/10311     Hash#each
  20.56%   0.11%      0.643      0.003      0.000      0.639            10311    *Array#as_json                  /Users/schwern/.rvm/gems/ruby-2.6.3/gems/activesupport-5.2.3/lib/active_support/core_ext/object/json.rb:151
                      0.643      0.051      0.000      0.592      10311/20623     Array#map
------------------------------------------------------------------------------------------------------------------------------------------------------
                      0.000      0.000      0.000      0.000      1102/187785     Array#map
                      0.033      0.033      0.000      0.000    186683/187785     Hash#each
   1.05%   1.05%      0.033      0.033      0.000      0.000           187785     String#as_json                 /Users/schwern/.rvm/gems/ruby-2.6.3/gems/activesupport-5.2.3/lib/active_support/core_ext/object/json.rb:86
------------------------------------------------------------------------------------------------------------------------------------------------------
                      0.001      0.001      0.000      0.000        2058/4116     Hash#each
                      0.001      0.001      0.000      0.000        2058/4116     ActiveSupport::JSON::Encoding::JSONGemEncoder#jsonify
   0.05%   0.04%      0.002      0.001      0.000      0.000             4116     Float#as_json                  /Users/schwern/.rvm/gems/ruby-2.6.3/gems/activesupport-5.2.3/lib/active_support/core_ext/object/json.rb:106
                      0.000      0.000      0.000      0.000        4116/4116     Float#finite?
------------------------------------------------------------------------------------------------------------------------------------------------------

jsonify converts the as_json structure into JSON primitives. "This allows developers to implement #as_json without having to worry about what base types of objects they are allowed to return or having to remember to call #as_json recursively." This eats up about 29% of the total time.

------------------------------------------------------------------------------------------------------------------------------------------------------
                      0.894      0.107      0.000      0.787     92348/519065     Array#map
                      0.894      0.224      0.000      0.670    426716/519065     Hash#each
                      0.894      0.000      0.000      0.894         1/519065     ActiveSupport::JSON::Encoding::JSONGemEncoder#encode
  28.60%  10.59%      0.894      0.331      0.000      0.563           519065    *ActiveSupport::JSON::Encoding::JSONGemEncoder#jsonify /Users/schwern/.rvm/gems/ruby-2.6.3/gems/activesupport-5.2.3/lib/active_support/json/encoding.rb:85
                      0.894      0.036      0.000      0.858    101047/202094     Enumerable#map
                      0.894      0.029      0.000      0.866      10311/20623     Array#map
                      0.193      0.129      0.000      0.064    401143/401153     Class#new
                      0.130      0.130      0.000      0.000  1104275/1104283     Module#===
                      0.031      0.031      0.000      0.000    101047/202094     <Class::Hash>#[]
                      0.001      0.001      0.000      0.000        2058/4116     Float#as_json
                      0.001      0.001      0.000      0.000        2533/5066     FalseClass#as_json
                      0.000      0.000      0.000      0.000        1973/3946     TrueClass#as_json
------------------------------------------------------------------------------------------------------------------------------------------------------
  • stringify hands all that to JSON.generate.
------------------------------------------------------------------------------------------------------------------------------------------------------
                      1.483      0.000      0.000      1.483              1/1     ActiveSupport::JSON::Encoding::JSONGemEncoder#encode
  47.42%   0.00%      1.483      0.000      0.000      1.483                1     ActiveSupport::JSON::Encoding::JSONGemEncoder#stringify /Users/schwern/.rvm/gems/ruby-2.6.3/gems/activesupport-5.2.3/lib/active_support/json/encoding.rb:101
                      1.483      0.000      0.000      1.483              1/1     <Module::JSON>#generate
------------------------------------------------------------------------------------------------------------------------------------------------------

JSON.generate calls JSON::Ext::Generator::State#generate which eats 47% of the total time. But 38% is spent in ActiveSupport::JSON::Encoding::JSONGemEncoder::EscapedString#to_json, this method escapes HTML. That means the time spent actually generating JSON is just 10% of the run time. This jives with the Benchmark results.

------------------------------------------------------------------------------------------------------------------------------------------------------
                      1.483      0.228      0.000      1.254              1/1     <Module::JSON>#generate
  47.42%   7.31%      1.483      0.228      0.000      1.254                1     JSON::Ext::Generator::State#generate 
                      1.176      0.271      0.000      0.905    401143/401143     ActiveSupport::JSON::Encoding::JSONGemEncoder::EscapedString#to_json
                      0.040      0.040      0.000      0.000    213358/213358     ActiveSupport::JSON::Encoding::JSONGemEncoder::EscapedString#to_s
                      0.037      0.037      0.000      0.000    101047/101047     Hash#keys
                      0.001      0.001      0.000      0.000        2058/2058     Float#to_s
------------------------------------------------------------------------------------------------------------------------------------------------------
                      1.176      0.271      0.000      0.905    401143/401143     JSON::Ext::Generator::State#generate
  37.63%   8.68%      1.176      0.271      0.000      0.905           401143     ActiveSupport::JSON::Encoding::JSONGemEncoder::EscapedString#to_json /Users/schwern/.rvm/gems/ruby-2.6.3/gems/activesupport-5.2.3/lib/active_support/json/encoding.rb:55
                      0.614      0.209      0.000      0.405    401143/401143     ActiveSupport::ToJsonWithActiveSupportEncoder#to_json
                      0.292      0.292      0.000      0.000    401143/401143     String#gsub
------------------------------------------------------------------------------------------------------------------------------------------------------```
like image 27
Schwern Avatar answered Nov 20 '22 00:11

Schwern