I have a Rails app served by NGINX and Puma. Roughly every 10 hours, the app becomes unusable. Whenever a user tries to connect, the following error message is displayed:
Error during failsafe response: could not obtain a database connection within 5.000 seconds (waited 5.000 seconds)
This continues until the app is restarted.
I have read that this is because the database connection pool is full, and so there must be threads being created in the rails app that are not closing their connection to the database when they finish. To my knowledge, there is only one place in the app code where threads are used: one block uses the Ruby Timeout module, but this does not access the database.
Following this guide, https://devcenter.heroku.com/articles/concurrency-and-database-connections (I am not actually using Heroku), I have set the size of the database connection pool to 5 with the following initializer:
# config/initializers/database_connection.rb
Rails.application.config.after_initialize do
  ActiveRecord::Base.connection_pool.disconnect!
  ActiveSupport.on_load(:active_record) do
    config = ActiveRecord::Base.configurations[Rails.env] ||
             Rails.application.config.database_configuration[Rails.env]
    config['reaping_frequency'] = ENV['DB_REAP_FREQ'] || 10 # seconds
    config['pool'] = ENV['MAX_THREADS'] || 5
    ActiveRecord::Base.establish_connection(config)
  end
end
The site runs on Rails 4.0.0. I have read that this may in fact be a Rails 4.0.0 problem that was fixed in later versions, but I am unsure of this (see the question "ConnectionTimeoutError on Heroku with Postgres").
The Rails app is running in the production environment. I can give more information on my Puma and NGINX configuration if needed.
The fact that the failsafe response is trying to allocate a database connection may be a smoking gun. It might help if you could describe what happens in the failsafe response. The failsafe response was presumably triggered because the original request raised an exception. The Rails show_exception routine, which calls the failsafe response, runs after the ConnectionManager has called clear_active_connections! for the current request (the one that failed with an exception). As a result, Rails will not automatically release any database connections checked out during the failsafe response, so the failsafe response handler is responsible for cleaning up its own database connections. I'm not sure it's good practice for the failsafe response handler to connect to the database at all, but if that is the desired behavior, you may have to call clear_active_connections! explicitly at the end of your failsafe handler (in an ensure block).
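As a rough illustration of the ensure-block cleanup, here is a minimal sketch. The method name `render_failsafe` and the `StubConnectionHandler` class are hypothetical: the stub only stands in for `ActiveRecord::Base` so that the snippet runs without Rails, and I don't know what your actual failsafe handler looks like.

```ruby
# Hypothetical stand-in for ActiveRecord::Base so the sketch runs without Rails.
class StubConnectionHandler
  attr_reader :checked_out

  def initialize
    @checked_out = 0
  end

  # Simulates the current thread implicitly checking out a connection.
  def connection
    @checked_out += 1
    :fake_connection
  end

  # Same name as ActiveRecord::Base.clear_active_connections!; here it
  # simply marks every simulated connection as returned.
  def clear_active_connections!
    @checked_out = 0
  end
end

# Hypothetical failsafe handler. Whatever it does with the database,
# the ensure clause returns the thread's connections afterwards,
# whether the body finishes normally or raises.
def render_failsafe(handler)
  handler.connection      # e.g. a lookup made while building the error page
  yield if block_given?   # the rest of the failsafe work, which may raise
  "500 Internal Server Error"
ensure
  handler.clear_active_connections!
end
```

In a real Rails 4 app you would presumably pass `ActiveRecord::Base` (or call `ActiveRecord::Base.clear_active_connections!` directly in the ensure clause) rather than a stub.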
I've been investigating a similar problem in my own app and found this to be a useful guide to how connections work: https://bibwild.wordpress.com/2014/07/17/activerecord-concurrency-in-rails4-avoid-leaked-connections/. The code it references may need a few tweaks, but it gives a good outline of how to detect when you create an implicit database connection.
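One way to avoid implicit checkouts in your own threads is to do the database work inside `with_connection`, which checks a connection out for the block and checks it back in afterwards. The sketch below is only an illustration: `StubPool` and `run_in_thread` are hypothetical names, and the stub stands in for `ActiveRecord::Base.connection_pool` so that the snippet runs without Rails.

```ruby
# Hypothetical stand-in for ActiveRecord::Base.connection_pool so the
# sketch runs without Rails; only the with_connection method name is
# taken from the real pool.
class StubPool
  attr_reader :in_use

  def initialize
    @in_use = 0
    @lock = Mutex.new
  end

  # Checks a connection out for the duration of the block and always
  # checks it back in, even if the block raises.
  def with_connection
    @lock.synchronize { @in_use += 1 }
    yield :fake_connection
  ensure
    @lock.synchronize { @in_use -= 1 }
  end
end

# Hypothetical helper: run database work on a new thread without
# leaving that thread's connection checked out afterwards.
def run_in_thread(pool, &work)
  Thread.new do
    pool.with_connection { |conn| work.call(conn) }
  end
end
```

With the real pool, the call would look something like `run_in_thread(ActiveRecord::Base.connection_pool) { |conn| ... }`, with the thread joined wherever the result is needed.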