
ActiveJob GlobalID and in-memory ActiveRecord objects

I'm using a queuing system (Sidekiq) and want to move to ActiveJob to avoid querying the database every time I pass an ActiveRecord object to a worker. I'm not 100% sure, so I want to confirm: when ActiveJob uses GlobalID to pass ActiveRecord objects, is that all done in memory, without a separate query to the database?

asked Oct 27 '25 04:10 by Ben Nelson



2 Answers

That is not correct.

If you use ActiveJob, it will serialize any ActiveRecord object into a GlobalID string for saving into your queue, then look the record up again from that string when the job starts. By default that string only includes the app name, class name, and id, and ActiveJob will use your database to load the model.

"gid://app/User/1"
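To make it concrete that this string carries no attributes, here is a minimal sketch that takes it apart using only Ruby's standard library. The `parse_gid` helper is hypothetical and is not how the globalid gem is implemented; it only shows which pieces of information are available to the job.

```ruby
require "uri"

# Hypothetical helper: split a GlobalID-style string into its parts.
# The real globalid gem does more (validation, params); this is only a sketch.
def parse_gid(gid_string)
  uri = URI.parse(gid_string)
  model_name, model_id = uri.path.delete_prefix("/").split("/", 2)
  { app: uri.host, model_name: model_name, model_id: model_id }
end

parts = parse_gid("gid://app/User/1")
puts parts[:app]        # app name
puts parts[:model_name] # class name
puts parts[:model_id]   # record id, as a string
```

Nothing beyond these three values travels through the queue, which is why the record has to be loaded again when the job runs.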

DelayedJob will serialize any object you give it into a YAML string and deserialize it without hitting the DB beyond loading the job. You can do this with Sidekiq too; in that case loading the job hits Redis and does not touch the primary database.

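user = User.find(1)
# Pass the serialized record to the job as a plain string argument
MyJob.perform_later(user.to_yaml)

# Load the user object from the yaml
# (on Psych 4+, YAML.load is safe by default; YAML.unsafe_load is needed for arbitrary objects)
YAML.load(user.to_yaml) == user # true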

You'll get your object without a trip to the DB. However that YAML is going to be large, and the performance penalty you get with Redis might not be worth it.

There are a few more gotchas you should look out for. The object might be out of date, in terms of both data and structure. If you change your code, a serialized object may have trouble loading again due to structure changes. And if you update the database after serializing the object, you'll unknowingly be working with old data when you load it.
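The stale-data gotcha can be reproduced without Rails. The sketch below assumes a hypothetical in-memory hash standing in for the database and a plain attributes hash standing in for the model; it is an illustration of the effect, not of how any queue library stores jobs.

```ruby
require "yaml"

# Hypothetical in-memory "database": id => attributes.
db = { 1 => { "id" => 1, "name" => "Ann" } }

# Enqueue time: snapshot the attributes into the job payload.
payload = YAML.dump(db[1])

# The record changes before the job runs.
db[1]["name"] = "Anna"

# Job time: the snapshot still carries the value from enqueue time.
snapshot = YAML.safe_load(payload)
puts snapshot["name"] # value captured when the job was enqueued
puts db[1]["name"]    # current value in the "database"
```

The job sees the name as it was when the payload was built, with no signal that the underlying record has since changed.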

Hope that helps you understand what ActiveJob and GlobalID provide.

answered Oct 28 '25 17:10 by reconbot



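The database query will be performed anyway, but transparently. Take the following console session as an example of how a record becomes a GlobalID before it is enqueued. Note that the query logged here comes from User.find(1); building the GlobalID itself only needs the class name and id: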

gid = User.find(1).to_global_id
  User Load (0.8ms)  SELECT  "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2  [["id", 1], ["LIMIT", 1]]
=> #<GlobalID:0x00007f86f76f46d8 @uri=#<URI::GID gid://app/User/1>>

Then, when the job is performed, ActiveJob runs the following code internally, which queries the database anyway:

GlobalID::Locator.locate(gid)
  User Load (0.3ms)  SELECT  "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2  [["id", 1], ["LIMIT", 1]]
=> #<User id: 1, ... >

A problem with using GlobalIDs is that, if a passed record is deleted after the job is enqueued but before the #perform method is called, Active Job will raise an ActiveJob::DeserializationError exception.
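The failure mode can be sketched in plain Ruby. Everything here is a hypothetical stand-in: `store` plays the role of the database, `locate` plays the role of the lookup, and the built-in `KeyError` plays the role of the deserialization error.

```ruby
# Hypothetical lookup: resolve a GlobalID-style string against a nested hash.
# Hash#fetch raises KeyError when the class or id is missing.
def locate(store, gid_string)
  _app, model_name, id = gid_string.sub("gid://", "").split("/")
  store.fetch(model_name).fetch(id)
end

store = { "User" => { "1" => { name: "Ann" } } }

enqueued_arg = "gid://app/User/1" # what sits in the queue
store["User"].delete("1")         # record deleted before the job runs

outcome =
  begin
    locate(store, enqueued_arg)
    :performed
  rescue KeyError
    :discarded # the job body never runs
  end
puts outcome
```

In a Rails application, the usual way to handle this case is to declare `discard_on ActiveJob::DeserializationError` in the job class, so that jobs whose record no longer exists are dropped instead of being reported as failures.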

Performance

According to Mike Perham, the author of Sidekiq, benchmarks show that ActiveJob is 2-20x slower at pushing jobs to Redis and has roughly 3x the processing overhead (https://github.com/mperham/sidekiq/wiki/Active-Job#performance).

Additional information

All the information regarding Sidekiq, ActiveJob and GlobalID can be found here: https://github.com/mperham/sidekiq/wiki/Active-Job#using-global-id

answered Oct 28 '25 18:10 by Pere Joan Martorell



