Our ruby on rails site has a URI that one of our partners POSTs XML data to.
Since we don't want to deal with XML, we literally just stuff the raw data into a database column and don't go any further with processing it.
However, one of the posts we received gave us this error in airbrake:
ArgumentError: invalid %-encoding ("http://ns.hr-xml.org/2004-08-02" 
userId="" password=""><BackgroundReportPackage type="report">
<ProviderReferenceId>....
With backtrace:
vendor/ruby-1.9.3/lib/ruby/1.9.1/uri/common.rb:898:in `decode_www_form_component'
vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/utils.rb:41:in `unescape'
vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/utils.rb:94:in `block (2 levels) in parse_nested_query'
vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/utils.rb:94:in `map'
vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/utils.rb:94:in `block in parse_nested_query'
vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/utils.rb:93:in `each'
vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/utils.rb:93:in `parse_nested_query'
vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/request.rb:332:in `parse_query'
vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/request.rb:209:in `POST'
vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/methodoverride.rb:26:in `method_override'
vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/methodoverride.rb:14:in `call'
vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/runtime.rb:17:in `call'
vendor/bundle/ruby/1.9.1/gems/activesupport-3.2.13/lib/active_support/cache/strategy/local_cache.rb:72:in `call'
vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/lock.rb:15:in `call'
vendor/bundle/ruby/1.9.1/gems/actionpack-3.2.13/lib/action_dispatch/middleware/static.rb:63:in `call'
vendor/bundle/ruby/1.9.1/gems/rack-cache-1.2/lib/rack/cache/context.rb:136:in `forward'
vendor/bundle/ruby/1.9.1/gems/rack-cache-1.2/lib/rack/cache/context.rb:143:in `pass'
vendor/bundle/ruby/1.9.1/gems/rack-cache-1.2/lib/rack/cache/context.rb:155:in `invalidate'
vendor/bundle/ruby/1.9.1/gems/rack-cache-1.2/lib/rack/cache/context.rb:71:in `call!'
vendor/bundle/ruby/1.9.1/gems/rack-cache-1.2/lib/rack/cache/context.rb:51:in `call'
vendor/bundle/ruby/1.9.1/gems/railties-3.2.13/lib/rails/engine.rb:479:in `call'
vendor/bundle/ruby/1.9.1/gems/railties-3.2.13/lib/rails/application.rb:223:in `call'
vendor/bundle/ruby/1.9.1/gems/railties-3.2.13/lib/rails/railtie/configurable.rb:30:in `method_missing'
vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/deflater.rb:13:in `call'
vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/content_length.rb:14:in `call'
vendor/bundle/ruby/1.9.1/gems/railties-3.2.13/lib/rails/rack/log_tailer.rb:17:in `call'
vendor/bundle/ruby/1.9.1/gems/thin-1.4.1/lib/thin/connection.rb:80:in `block in pre_process'
vendor/bundle/ruby/1.9.1/gems/thin-1.4.1/lib/thin/connection.rb:78:in `catch'
vendor/bundle/ruby/1.9.1/gems/thin-1.4.1/lib/thin/connection.rb:78:in `pre_process'
vendor/bundle/ruby/1.9.1/gems/thin-1.4.1/lib/thin/connection.rb:53:in `process'
vendor/bundle/ruby/1.9.1/gems/thin-1.4.1/lib/thin/connection.rb:38:in `receive_data'
vendor/bundle/ruby/1.9.1/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run_machine'
vendor/bundle/ruby/1.9.1/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run'
vendor/bundle/ruby/1.9.1/gems/thin-1.4.1/lib/thin/backends/base.rb:63:in `start'
vendor/bundle/ruby/1.9.1/gems/thin-1.4.1/lib/thin/server.rb:159:in `start'
vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/handler/thin.rb:13:in `run'
vendor/bundle/ruby/1.9.1/gems/rack-1.4.5/lib/rack/server.rb:268:in `start'
vendor/bundle/ruby/1.9.1/gems/railties-3.2.13/lib/rails/commands/server.rb:70:in `start'
vendor/bundle/ruby/1.9.1/gems/railties-3.2.13/lib/rails/commands.rb:55:in `block in <top (required)>'
vendor/bundle/ruby/1.9.1/gems/railties-3.2.13/lib/rails/commands.rb:50:in `tap'
vendor/bundle/ruby/1.9.1/gems/railties-3.2.13/lib/rails/commands.rb:50:in `<top (required)>'
script/rails:6:in `require'
script/rails:6:in `<main>'
The issue is that the POST contains the data:
<ChargeOrComplaint>DRIVE WHILE BLOOD ALCOHOL LEVEL IS 0.08% OR MORE</ChargeOrComplaint>
Presumably this is valid XML, but the naked % at the end of 0.08% is causing the error, since it's coming via HTTP and I guess rack is expecting it to be URL encoded.
The backtrace indicates this is happening before it even gets to our code, so I don't think it has anything to do with how we're processing it.
My questions, then:
1) Where does the issue lie? Ruby 1.9.3's implementation of decode_www_form_component (at the top of the stack trace)? Rack? Our partner's POST data or headers? Our handling of the POST?
2) Does XML data POSTed via HTTP need to be URL encoded?
3) Is there a header that this POST needs to have for Rack to interpret it correctly? (i.e.: that it's XML binary data, and not URL encoded).
4) If I can't get our partner to change what they're posting to us, how could we work around it? Some Rack middleware?
I'm guessing your partner is probably POSTing the data to you as "x-www-form-urlencoded", making Rack try to parse it that way. If they can change what they're sending, I suspect making their content-type "text/xml" will fix this.
If you can't get them to change what they send, then yes, I think you'd have to use Rack middleware (or monkeypatching). Although you could poke around the Rack source, maybe there's a setting to avoid doing any parsing.
There's some debate in various forums as to where the responsibility lies for catching invalid encoding errors in request body content, but neither rack nor rails handles it, both leaving it to the app to handle. To work around invalid %-encoding in POST data in my app, I used a similar solution to this related question: Rails ArgumentError: invalid %-encoding
I added this middleware in app/middleware/invalid_post_data_interceptor.rb to intercept invalid post data:
class InvalidPostDataInterceptor
  def initialize(app)
    @app = app
  end
  def call(env)
    request_content = Rack::Request.new(env).POST rescue :bad_form_data
    headers = {'Content-Type' => 'text/plain'}
    if request_content == :bad_form_data
      [400, headers, ['Bad Request']]
    else
      @app.call(env)
    end
  end
end
Then added it to the middleware stack by adding this to application.rb:
config.middleware.insert_before Rack::Runtime, "InvalidPostDataInterceptor"
In my case the reason was extra newlines after the headers and before the request body. I'm guessing there is Content-Length inconsistency which throws off the parser. If you are setting headers programmatically make sure they don't have trailing newlines.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With