
RSpec let vs. instance variable with expensive object creation

In RSpec, let memoizes its value within a single example (an it block), but in some cases this can have nasty side effects on run time.

I've noticed that if a let creates anything expensive, such as a large mock, the whole object is rebuilt in every example that calls it.

The first step in troubleshooting this was hacking the mock data down to size, which cut the run time from ~30 seconds to ~0.08 seconds. Beyond that, moving a let variable that is called 3+ times without any mutation into an instance variable speeds things up even more (by 0.02 to 0.04 seconds in this case).

Normally, it could be reasoned that lazy evaluation is desirable and that this cost is the price of safety. In a large test suite (3000+ tests), though, a difference of even 0.01-0.02 seconds, repeated often enough, can add 20-30 seconds of bloat. The numbers are somewhat arbitrary, but you can see why this is undesirable and how the problem compounds.

The questions I have are:

  • In what cases is let no longer a viable option?
  • Is there any way to stretch its memoization across a block context instead of an example context? ...or is that a horrid idea?
  • Are there efficient ways to generate exceptionally large amounts of mock data that aren't likely to take 7+ seconds to load? I've seen vague references to Factory Girl but there's enough fighting on that note I don't know what to think in a current context.

Thank you for your time!

asked Dec 26 '13 at 22:12 by baweaver


1 Answer

As you seem to be aware, let basically just keeps you from evaluating the variable except in examples where you use it. If you use it in ten examples, you will indeed get ten hits to the expensive operation.

So for your first question, I don't know that I can offer a useful answer. It's pretty situational, but I'd say let isn't viable if you're using the variable a lot and it's an expensive operation. But depending on your needs, it might still be the best option - maybe you have to have the state reset in most examples, but not all. In that case, the expense of the operation might not be worth the pain of trying to share it in just a few cases.

For your second question, I'd say it's probably not a good idea to try and make let work within a block. That's a case for a before(:all) block and an instance variable.

Your third question is where the real meat is, I think, so bear with me here.

FactoryGirl isn't really going to change your problem. It will build and optionally save objects, but you still have to decide where and how to use it. If you start popping it into before(:each) blocks, or calling a builder in most examples, you'll still have performance hits.

Depending on your needs, you could do expensive operations in a before(:all) block or even a before(:suite) block (configuring in your spec_helper.rb, for instance). This has the advantage of giving you fewer hits to the expensive operation, but the downside is that if you're modifying the data, it's modified for all other tests. This can obviously cause a lot of difficult-to-debug problems. If your data needs to be changed by multiple examples, and then reset to a pristine state, you're going to be stuck with some kind of performance hit or else custom logic of your own design.

If your data is primarily in ActiveRecord objects, and you aren't keen on stubbing/mocking to keep from hitting the database, chances are you're stuck with slow tests. Fixtures can be used with transactions to help a bit, and can be faster than factories, but can be a pain to maintain depending on your database schema, relationships, etc. I believe you can use factories in a before(:suite) block, and then transactions will still work, but that isn't necessarily significantly easier to maintain than fixtures.

If your data is just CPU-expensive objects as opposed to database records, you could set up a bunch of objects and serialize them via the Marshal module. Then you can load them up in a let block, prebuilt and ready, with just a disk hit (or memory, if you store the Marshalled string in memory):

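# In irb or pry or even spec_helper.rb
object = SomeComplexThing.new
object.prepare_it_with_expensive_method_call_fun
# Marshal.dump returns a binary String; store it somewhere, e.g. a file
File.binwrite("serialized_thing", Marshal.dump(object))

# In some_spec.rb
# Read in binary mode so the serialized bytes are not altered
let(:thing) { Marshal.load(File.binread("serialized_thing")) }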

This has the advantage of serializing the object's state in full, and restoring it exactly as it was without re-computing expensive data. This probably won't work as well for really complex objects like an ActiveRecord model, but it can be handy for simpler data structures of your own design. You can even implement your own dumping / loading logic by implementing marshal_dump and marshal_load methods (see the Marshal docs I linked above), which can be handy outside of tests.
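As a rough illustration of the custom dumping / loading idea, here is a sketch with a made-up LookupTable class (the class, its fields, and square_of are all hypothetical). It persists only the expensive data and rebuilds a throwaway cache on load, and it keeps the serialized string in memory rather than on disk.

```ruby
# Hypothetical class: an expensive lookup table plus a cache we do not
# want to persist.
class LookupTable
  attr_reader :entries, :cache

  def initialize(size)
    @entries = (1..size).map { |n| [n, n * n] }.to_h # stand-in for costly work
    @cache = {}
  end

  def square_of(n)
    @cache[n] ||= @entries.fetch(n)
  end

  # Called by Marshal.dump: return only the state worth saving.
  def marshal_dump
    @entries
  end

  # Called by Marshal.load on an allocated instance (initialize is not run).
  def marshal_load(entries)
    @entries = entries
    @cache = {}
  end
end

table = LookupTable.new(100)
table.square_of(7)             # warms the cache on the original only
bytes = Marshal.dump(table)    # binary String; can be held in memory
restored = Marshal.load(bytes) # separate object, cache starts empty
```

In a spec, the bytes string could be built once (for example in a constant) and each example could call Marshal.load on it, so every example gets a fresh copy without repeating the expensive construction.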

If your data is simple enough, you may even be able to get away with a setup like this:

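# In spec_helper.rb
# Instance variables set in before(:suite) are not visible inside examples,
# so keep the prebuilt object somewhere globally reachable.
# (PrebuiltThing is just an illustrative name.)
module PrebuiltThing
  class << self
    attr_accessor :object
  end
end

RSpec.configure do |config|
  config.before(:suite) do
    PrebuiltThing.object = SomeComplexThing.new
    PrebuiltThing.object.prepare_it_with_expensive_method_call_fun
  end
end

# In a test
let(:thing) { PrebuiltThing.object.dup }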

This isn't necessarily going to work in all cases, as dup is a shallow copy (see the Ruby docs for more info), but you get the idea - you're building a copy rather than re-computing whatever expensive stuff is hurting you.
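To show what the shallow-copy caveat means in practice, here is a small sketch. Report is a hypothetical struct standing in for your prebuilt object, and the Marshal round trip is one common way to get a deep copy, assuming everything inside the object can be marshalled.

```ruby
# Hypothetical prebuilt object with nested, mutable data.
Report = Struct.new(:title, :rows)

master = Report.new("baseline", [{ total: 1 }, { total: 2 }])

shallow = master.dup                         # new Report, same rows array
deep    = Marshal.load(Marshal.dump(master)) # independent copy of everything

shallow.rows.first[:total] = 99 # leaks into master: the rows array is shared
deep.rows.last[:total] = -1     # does not touch master

shallow.title = "changed"       # reassigning an attribute is safe with dup
```

After this runs, master.rows.first[:total] is 99 even though only the dup was modified, while master.title is still "baseline". If your examples change nested data, the deep copy is the safer choice, at the cost of a serialization round trip per example.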


I hope this information helps, as I'm not sure I fully understand exactly what you need.

answered Oct 29 '22 at 23:10 by Nerdmaster