So my app exports a 11.5 MB CSV file and uses basically all of the RAM that never gets freed.
The data for the CSV is taken from the DB, and in the case mentioned above the whole thing is being exported.
I am using Ruby 2.4.1 standard CSV library in the following fashion:
export_helper.rb
:
CSV.open('full_report.csv', 'wb', encoding: UTF-8) do |file|
data = Model.scope1(param).scope2(param).includes(:model1, :model2)
data.each do |item|
file << [
item.method1,
item.method2,
item.methid3
]
end
# repeat for other models - approx. 5 other similar loops
end
and then in the controller:
generator = ExportHelper::ReportGenerator.new
generator.full_report
respond_to do |format|
format.csv do
send_file(
"#{Rails.root}/full_report.csv",
filename: 'full_report.csv',
type: :csv,
disposition: :attachment
)
end
end
After a single request the puma processes load 55% of the whole server's RAM and stay like that until eventually run out of memory completely.
For instance in this article generating a million-lines 75 MB CSV file required only 1 MB of RAM. But there is no DB querying involved.
The server has 1015 MB RAM + 400 MB of swap memory.
So my questions are:
Thanks in advance!
Instead of each
you should be using find_each
, which is specifically for cases like this, because it will instantiate the Models in batches and release them afterwards, whereas each
will instantiate all of them at once.
CSV.open('full_report.csv', 'wb', encoding: UTF-8) do |file|
Model.scope1(param).find_each do |item|
file << [
item.method1
]
end
end
Furthermore you should stream the CSV instead of writing it to memory or disk before sending it to the browser:
format.csv do
headers["Content-Type"] = "text/csv"
headers["Content-disposition"] = "attachment; filename=\"full_report.csv\""
# streaming_headers
# nginx doc: Setting this to "no" will allow unbuffered responses suitable for Comet and HTTP streaming applications
headers['X-Accel-Buffering'] = 'no'
headers["Cache-Control"] ||= "no-cache"
# Rack::ETag 2.2.x no longer respects 'Cache-Control'
# https://github.com/rack/rack/commit/0371c69a0850e1b21448df96698e2926359f17fe#diff-1bc61e69628f29acd74010b83f44d041
headers["Last-Modified"] = Time.current.httpdate
headers.delete("Content-Length")
response.status = 200
header = ['Method 1', 'Method 2']
csv_options = { col_sep: ";" }
csv_enumerator = Enumerator.new do |y|
y << CSV::Row.new(header, header).to_s(csv_options)
Model.scope1(param).find_each do |item|
y << CSV::Row.new(header, [item.method1, item.method2]).to_s(csv_options)
end
end
# setting the body to an enumerator, rails will iterate this enumerator
self.response_body = csv_enumerator
end
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With