
Queuing requests (Delayed Job): downloading a large CSV file as a background process

This works fine in my controller:

def export_list_sites_as_csv
  require "csv"
  csv_string = CSV.generate do |csv|
    csv << ["id", "name", "etc"]
    @search.relation.not_archived.each do |site|
      csv << [site.id, site.name, site.etc]
    end
  end
  send_data csv_string,
            :type => "text/csv",
            :filename => "_sites.csv",
            :disposition => "attachment"
end

The @search relation depends on user-applied filters and can grow large enough to put a heavy load on RAM. The UX is also poor: other requests are held up until the current one is served, which makes my system hang. So I'm looking to run the export as a background process and let the user know once it's ready to download.
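One way to cut the memory load is to stream rows to a file on disk instead of building the whole CSV string in RAM, which is also the natural shape for a background job. Here is a minimal runnable sketch of that idea in plain Ruby; the `Site` struct below is a hypothetical stand-in for the ActiveRecord model:

```ruby
require "csv"
require "tmpdir"

# Hypothetical stand-in for the ActiveRecord Site model.
Site = Struct.new(:id, :name, :etc)

# Writes rows to disk one at a time instead of holding one big string in memory.
def export_sites_to_csv(sites, path)
  CSV.open(path, "w") do |csv|
    csv << ["id", "name", "etc"]
    sites.each { |site| csv << [site.id, site.name, site.etc] }
  end
  path
end

path  = File.join(Dir.tmpdir, "sites_export.csv")
sites = [Site.new(1, "Alpha", "x"), Site.new(2, "Beta", "y")]
export_sites_to_csv(sites, path)
```

In a real app the job would iterate the filtered relation (ideally with `find_each` to batch the query) rather than an in-memory array.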

When I try to move this code into the model, I get an error:

undefined method `send_data' for #<Class:0x9f8bed0>

I'm moving it to the model because I have to call Delayed Job on it.

I'm dealing with CSV and Delayed Job for the first time.

Edit: ActionController::Streaming is available only in controllers, so is there another way around this? More often than not, this approach isn't going anywhere.

As D-side's answer says, I will have to look for other ways.

Edit2: Following http://railscasts.com/episodes/171-delayed-job, I was able to do this:

class ExportCsv < Struct.new(:site_ids, :user_id)
  def perform
    require "csv"
    sites = Site.where(id: site_ids)
    CSV.open("tmp/#{user_id}.csv", "w+") do |csv|
      csv << ["id", "name", "etc"]
      sites.each do |site|
        csv << ....
      end
    end
  end

  def after(job)
    send_file(
      ....
    )
  end
end
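With Delayed Job, a Struct-based job like this is queued with `Delayed::Job.enqueue ExportCsv.new(site_ids, user_id)` and a worker later calls its `perform` method. The sketch below exercises the same pattern in plain Ruby (note `Struct.new`, not `Struct(...)`); `FakeSite` and `SITES` are hypothetical stand-ins for `Site.where(id: site_ids)`:

```ruby
require "csv"
require "tmpdir"

# Hypothetical stand-in for the Site model and its records.
FakeSite = Struct.new(:id, :name, :etc)
SITES = [FakeSite.new(1, "Alpha", "a"), FakeSite.new(2, "Beta", "b")]

# Same shape as the ExportCsv job above, with the database call stubbed out.
class ExportCsvJob < Struct.new(:site_ids, :user_id)
  def perform
    sites = SITES.select { |s| site_ids.include?(s.id) }
    CSV.open(csv_path, "w") do |csv|
      csv << ["id", "name", "etc"]
      sites.each { |s| csv << [s.id, s.name, s.etc] }
    end
  end

  # One file per user, as in the tmp/#{user_id}.csv path above.
  def csv_path
    File.join(Dir.tmpdir, "#{user_id}.csv")
  end
end

job = ExportCsvJob.new([1, 2], 42)
job.perform
# In a Rails app with Delayed Job this would instead be:
#   Delayed::Job.enqueue ExportCsvJob.new([1, 2], 42)
```

Because the job is a plain Struct, Delayed Job can serialize its members (`site_ids`, `user_id`) into the jobs table and reconstruct it in the worker process.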

How can I use ActionController::Streaming inside a custom class like ExportCsv, or inside a model?

Edit:

For an understanding of the synchronization involved and how I dealt with the situation, see my answer: http://imnithin.github.io/csv_download_with_delayed_job.html

Nithin asked Sep 29 '22

1 Answer

The thing you're trying to do defeats the purpose of DelayedJob.

When a user makes a request, the server should produce a response to fulfill it. The problem is that some requests take quite a lot of time to complete, and the user has to hang on and wait until the work is done. A classic case is massive email delivery, but there are others, as you've mentioned, like dataset generation. Whatever it is, it takes more time to complete than you can afford to keep your user waiting.

Now here comes DelayedJob. It executes a certain action outside the context of a request, so it doesn't need to hurry. But you can't just drop send_data into it: there is no request for it to respond to. Instead, it should write the results of the job into some persistent storage.

You have a number of ways of pulling this off.

  • You can have your user notified via email when the dataset is ready. You can even attach it to the email, but I would not recommend that: you can't rely on email providers' willingness to accept a large chunk of data. Instead, create a link for downloading the dataset and send that.
    • DelayedJob will need to render the dataset, save it to a file, build the link, and email it to the user.
  • Make a section of your app (a model, controller, and views) along the lines of "Completed requests", perhaps as part of the user's profile. The "launch" request should instruct the user to come back later and check that list for the result.
    • DelayedJob will have to add an entry to that list after the request is fulfilled. How the dataset is stored is irrelevant, but you could combine this with the approach above: save it to a file and display a link to it.
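The second approach can be sketched as a job that writes the file and then records a "completed export" entry the user's profile page can list. This is a runnable illustration in plain Ruby; `COMPLETED_EXPORTS` is a hypothetical in-memory stand-in for what would be an ActiveRecord model in a real app:

```ruby
require "csv"
require "tmpdir"

# Hypothetical stand-in for a CompletedExport ActiveRecord model.
COMPLETED_EXPORTS = []

class ListedExportJob < Struct.new(:user_id, :rows)
  def perform
    path = File.join(Dir.tmpdir, "export_#{user_id}.csv")
    CSV.open(path, "w") do |csv|
      csv << ["id", "name"]
      rows.each { |row| csv << row }
    end
    # Record the finished export so a "Completed requests" page can link to it.
    COMPLETED_EXPORTS << { user_id: user_id, path: path, finished_at: Time.now }
  end
end

ListedExportJob.new(7, [[1, "Alpha"]]).perform
```

The controller for the "Completed requests" section then only reads from that store and serves the finished files, so no request ever blocks on CSV generation.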
D-side answered Oct 03 '22