
User's threads in Thin, Sinatra application on Heroku

I have a set of time-consuming operations that are specific to each user of my app, all encapsulated in one method (e.g. a write_collections method). In this method the program communicates with Facebook and MongoDB. I want to run this method in a separate thread for each user.

This thread is started in the get '/' Sinatra route, but its result (a state in the database) is needed only in get '/calculate'. My idea is to start the thread in get '/' and join it in get '/calculate', to ensure that all of the user's data has been written to the database before the calculation of that user's results starts.

To illustrate:


Approach I

get "/" do    
  @user = @graph.get_object("me")

  data_thread = Thread.new do
    write_collections(@user)
  end

  session[:data_thread] = data_thread.object_id

  erb :index
end

get "/calculate" do
  begin
    # Is this safe enough?
    if ObjectSpace._id2ref(session[:data_thread]).alive?
      data_thread = ObjectSpace._id2ref(session[:data_thread])
      data_thread.join
    end
  rescue RangeError => range_err
    # session[:data_thread] is not id value
    # direct access to /calculate without session
  rescue TypeError => type_err
    # session[:data_thread] is nil
  end

  # do calculations based on state in database
  # and show results to user

  "<p>Under construction</p>"
end

To find the thread that a specific user should wait on, I currently use ObjectSpace._id2ref(session[:data_thread]).

  • Is it safe enough?

Detailed:

From the official Ruby docs for Object#object_id:

object_id → fixnum: Returns an integer identifier for obj. The same number will be returned on all calls to id for a given object, and no two active objects will share an id.

and for ObjectSpace:

The ObjectSpace module contains a number of routines that interact with the garbage collection facility and allow you to traverse all living objects with an iterator.

  • Is the 'active object' from the first quote the same as the 'living object' from the second?

Let's assume the following situation:

  1. User A accesses '/' [thread A is started with object_id a]
  2. Thread A finishes [it is no longer active and its object_id is released]
  3. User B accesses '/' [thread B is started with the same object_id a (* is this possible?)]
  4. User A accesses '/calculate' [session[:data_thread] is a, so ObjectSpace._id2ref(session[:data_thread]) is actually thread B]
  5. Inconsistent state: user A is waiting for thread B.

  • Is this scenario possible with Sinatra, Thin and Heroku?
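
One way to sidestep the id-reuse concern in the scenario above is to avoid looking threads up by object_id at all. The following is only a minimal sketch: ThreadRegistry is a hypothetical helper (not part of Sinatra, Thin or Ruby's standard library), and the token would be stored in the session in place of the object_id. Because the registry holds a real reference to each Thread, the thread object cannot be garbage-collected while it is registered.

```ruby
require "securerandom"

# Hypothetical helper: maps opaque random tokens to running threads.
class ThreadRegistry
  def initialize
    @mutex = Mutex.new   # guards @threads against concurrent requests
    @threads = {}
  end

  # Runs the block in a new thread and returns a token identifying it.
  def start(&block)
    token = SecureRandom.hex(16)
    thread = Thread.new(&block)
    @mutex.synchronize { @threads[token] = thread }
    token
  end

  # Waits for the thread behind the token and forgets it.
  # Returns false for an unknown, already used, or nil token.
  def join(token)
    thread = @mutex.synchronize { @threads.delete(token) }
    return false if thread.nil?

    thread.join
    true
  end
end

registry = ThreadRegistry.new
token = registry.start { sleep 0.05 }  # in a route: session[:data_token] = token
registry.join(token)                   # waits for the work to finish
registry.join(token)                   # second call finds nothing to join
```

A stale or forged token simply finds no entry, so one user can never end up waiting on another user's thread. The registry still lives in a single process, so this does not address running on several dynos.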

Approach II

configure do
  # map user_id to corresponding user's thread
  data_threads_hash = {}
  set :data_threads_hash, data_threads_hash
end

get "/" do    
  @user = @graph.get_object("me")

  data_thread = Thread.new do
    write_collections(@user)
  end

  session[:user_id] = @user['id']
  settings.data_threads_hash[session[:user_id]] = data_thread

  erb :index
end

get "/calculate" do

  # delete returns the user's thread, or nil if none was registered
  # (e.g. direct access to /calculate), and keeps the hash from growing.
  data_thread = settings.data_threads_hash.delete(session[:user_id])
  data_thread.join if data_thread

  # do calculations based on state in database
  # and show results to user

  "<p>Under construction</p>"

end

Detailed:

I tried this after reading the Sinatra README. Under Configuration:

Run once, at startup, in any environment ... You can access those options via settings ...

And under Scopes and Binding, Application/Class Scope:

Every Sinatra application corresponds to a subclass of Sinatra::Base. If you are using the top-level DSL (require 'sinatra'), then this class is Sinatra::Application, otherwise it is the subclass you created explicitly. At class level you have methods like get or before, but you cannot access the request or session objects, as there is only a single application class for all requests.

I'm using the top-level DSL.

Options created via set are methods at class level ... You can reach the scope object (the class) like this: settings from within the request scope

  • Bearing in mind what @FrederickCheung said in the comments and the quote from Scopes and Binding, would this approach still work if I ever needed more than one dyno/worker? (Currently the app uses only one dyno.)
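
On the multi-dyno point: each dyno runs its own process, so a Hash held at class level exists only in that process's memory. A rough sketch of the effect, using fork on a single Unix machine as a stand-in for a second dyno (the registry variable and the "user-1" key are illustrative and not taken from the app):

```ruby
# Illustrative stand-in for settings.data_threads_hash.
registry = {}

reader, writer = IO.pipe

# The forked child plays the role of a second dyno/worker process.
pid = fork do
  reader.close
  registry["user-1"] = :thread_placeholder  # visible only inside the child
  writer.puts registry.size
  writer.close
  exit!(0)  # leave the child immediately, skipping at_exit handlers
end

writer.close
child_size = reader.gets.to_i  # size reported by the child process
Process.wait(pid)
parent_size = registry.size    # the parent's copy was never modified

puts "child registry size: #{child_size}, parent registry size: #{parent_size}"
```

A request for /calculate that is routed to a different process would find no thread to join, so any state that must survive across requests has to live somewhere shared, such as the database.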

Summary

  • How should I handle the described situation with users and their corresponding threads in Sinatra, and how good or bad are the approaches in the examples above?

Any comment or reference is welcome.

foki asked Oct 31 '22 12:10


1 Answer

I'm not sure if this design makes sense. Why does each user need its own thread? Why would one request ever join a thread created by another request? Even if it were possible (by using only one dyno), I don't think it is a good way of doing what you want to do.

Based on the description of the app in the question, you want to run some calculate method after the write_collections method finishes. So, why can't the write_collections method call some calculate method? Or, why can't an after filter or observer be used to do the calculation?

More generally, you seem to be conflating two separate things:

  1. The calculate functionality
  2. GET /calculate

I think there should be only one trigger for the calculate functionality: either the completion of write_collections or the user's request (GET /calculate).

The more general solution is to have the calculate functionality run in the background, and save the results to the database. Later, when the user makes a request, it is ready and can be returned quickly.
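
A minimal sketch of that background pattern, under several assumptions: ResultStore is a hypothetical in-memory stand-in for the MongoDB collection, and write_collections and calculate here are dummy placeholders for the question's real methods. The background job records a status next to the result, and the request handler only reads that record instead of joining a thread.

```ruby
# Hypothetical in-memory stand-in for the database collection.
class ResultStore
  def initialize
    @mutex = Mutex.new
    @rows = {}
  end

  # Merges attrs into the row stored for user_id.
  def write(user_id, attrs)
    @mutex.synchronize { @rows[user_id] = (@rows[user_id] || {}).merge(attrs) }
  end

  # Returns a copy of the user's row, or nil if there is none.
  def read(user_id)
    @mutex.synchronize { @rows[user_id]&.dup }
  end
end

# Dummy placeholders for the question's real methods.
def write_collections(_user_id)
  [1, 2, 3]
end

def calculate(data)
  data.sum
end

# What get "/" would trigger: fetch, calculate and persist in the background.
def start_background_job(store, user_id)
  store.write(user_id, status: "pending")
  Thread.new do
    begin
      data = write_collections(user_id)
      store.write(user_id, status: "done", result: calculate(data))
    rescue StandardError => e
      store.write(user_id, status: "failed", error: e.message)
    end
  end
end

# What get "/calculate" would do: read the stored state, never join a thread.
def calculate_response(store, user_id)
  row = store.read(user_id)
  return "No job has been started" if row.nil?

  case row[:status]
  when "done"   then "Result: #{row[:result]}"
  when "failed" then "Job failed: #{row[:error]}"
  else "Still working, please retry shortly"
  end
end

store = ResultStore.new
job = start_background_job(store, "user-1")
job.join  # only so this demo is deterministic; a route would not wait here
puts calculate_response(store, "user-1")
```

Because the handler depends only on stored state, it behaves the same no matter which process or dyno serves the request, provided the store is a shared database and not process memory.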

B Seven answered Nov 15 '22 06:11