I am using factory.LazyAttribute within a SubFactory call to pass in an object created in the factory_parent. This works fine. But if I pass the created object to a RelatedFactory, LazyAttribute can no longer see the factory_parent and fails.
This works fine:
class OKFactory(factory.DjangoModelFactory):
    class Meta:
        model = Foo
        exclude = ['sub_object']

    sub_object = factory.SubFactory(SubObjectFactory)
    object = factory.SubFactory(
        ObjectFactory,
        sub_object=factory.LazyAttribute(lambda obj: obj.factory_parent.sub_object))
The identical call to LazyAttribute fails here:
class ProblemFactory(OKFactory):
    class Meta:
        model = Foo
        exclude = ['sub_object', 'object']

    sub_object = factory.SubFactory(SubObjectFactory)
    object = factory.SubFactory(
        ObjectFactory,
        sub_object=factory.LazyAttribute(lambda obj: obj.factory_parent.sub_object))
    another_object = factory.RelatedFactory(AnotherObjectFactory, 'foo', object=object)
The identical LazyAttribute call can no longer see factory_parent, and can only access AnotherObject values. LazyAttribute throws the error:

AttributeError: The parameter sub_object is unknown. Evaluated attributes are... [then it lists all attributes of AnotherObjectFactory]
Is there a way round this?
I can't just put sub_object=sub_object into the ObjectFactory call, i.e.:

sub_object = factory.SubFactory(SubObjectFactory)
object = factory.SubFactory(ObjectFactory, sub_object=sub_object)

because if I then do:

object2 = factory.SubFactory(ObjectFactory, sub_object=sub_object)

a second sub_object is created, whereas I need both objects to refer to the same sub_object. I have tried SelfAttribute to no avail.
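For reference, the supporting factories referenced above are not shown in the question; a rough sketch of what they are assumed to look like (model names and fields are guesses based on the snippets):

class SubObjectFactory(factory.DjangoModelFactory):
    class Meta:
        model = SubObject  # assumed model name

class ObjectFactory(factory.DjangoModelFactory):
    class Meta:
        model = Object  # assumed model name

    # Default sub_object, overridden by the LazyAttribute in the factories above.
    sub_object = factory.SubFactory(SubObjectFactory)

class AnotherObjectFactory(factory.DjangoModelFactory):
    class Meta:
        model = AnotherObject  # assumed model name

    # `foo` (the FK back to Foo) and `object` are supplied by the
    # RelatedFactory(AnotherObjectFactory, 'foo', object=object) declaration.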
I think you can leverage the ability to override parameters passed in to the RelatedFactory to achieve what you want.

For example, given:

class MyFactory(OKFactory):
    object = factory.SubFactory(MyOtherFactory)
    related = factory.RelatedFactory(YetAnotherFactory)  # We want to pass object in here

If we knew what the value of object was going to be in advance, we could make it work with something like:
object = MyOtherFactory()
thing = MyFactory(object=object, related__param=object)
We can use this same naming convention to pass the object to the RelatedFactory within the main Factory:
class MyFactory(OKFactory):
    class Meta:
        exclude = ['object']

    object = factory.SubFactory(MyOtherFactory)
    related__param = factory.SelfAttribute('object')
    related__otherrelated__param = factory.LazyAttribute(
        lambda myobject: 'admin%d_%d' % (myobject.level, myobject.level - 1))
    related = factory.RelatedFactory(YetAnotherFactory)  # Will be called with {'param': object, 'otherrelated__param': 'admin1_2'}
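With those overrides declared in the class body, instantiation no longer needs explicit keyword arguments; a minimal usage sketch, assuming the factories above exist:

thing = MyFactory()
# YetAnotherFactory is called with the same (excluded) `object` instance via
# `param`, so the SubFactory and the RelatedFactory share one object.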
I solved this by simply calling factories within @factory.post_generation. Strictly speaking this isn't a solution to the specific problem posed, but I explain below in great detail why this ended up being a better architecture. @rhunwick's solution does genuinely pass a SubFactory(LazyAttribute('')) to a RelatedFactory; however, restrictions remained that meant this was not right for my situation.
We move the creation of sub_object and object from ProblemFactory to ObjectWithSubObjectsFactory (and remove the exclude clause), and add the following code to the end of ProblemFactory.
@factory.post_generation
def post(self, create, extracted, **kwargs):
    if not create:
        return  # No IDs, so wouldn't work anyway

    object = ObjectWithSubObjectsFactory()
    sub_object_ids_by_code = dict((sbj.name, sbj.id) for sbj in object.subobject_set.all())

    # self is the `Foo` Django object just created by the `ProblemFactory` that contains this code.
    for another_obj in self.anotherobject_set.all():
        if another_obj.name == 'age_in':
            another_obj.attribute_id = sub_object_ids_by_code['Age']
            another_obj.save()
        elif another_obj.name == 'income_in':
            another_obj.attribute_id = sub_object_ids_by_code['Income']
            another_obj.save()
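ObjectWithSubObjectsFactory itself is not shown in the original post; a minimal sketch of what it might look like, assuming the Object/SubObject models above and sub-objects named 'Age' and 'Income' so the lookup by name in the hook works:

class ObjectWithSubObjectsFactory(factory.DjangoModelFactory):
    class Meta:
        model = Object  # assumed model name

    # Hypothetical: create the two SubObject rows the post_generation hook
    # above finds via object.subobject_set.all().
    age = factory.RelatedFactory(SubObjectFactory, 'object', name='Age')
    income = factory.RelatedFactory(SubObjectFactory, 'object', name='Income')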
So it seems RelatedFactory calls are executed before post_generation calls.
The naming in this question is easier to understand, so here is the same solution code for that sample problem:
The creation of dataset, column_1 and column_2 is moved into a new factory, DatasetAnd2ColumnsFactory, and the code below is then added to the end of FunctionToParameterSettingsFactory.
@factory.post_generation
def post(self, create, extracted, **kwargs):
    if not create:
        return

    dataset = DatasetAnd2ColumnsFactory()
    column_ids_by_name = dict(
        (column.name, column.id) for column in dataset.column_set.all())

    # self is the `FunctionInstantiation` Django object just created by the
    # `FunctionToParameterSettingsFactory` that contains this code.
    for parameter_setting in self.parametersetting_set.all():
        if parameter_setting.name == 'age_in':
            parameter_setting.column_id = column_ids_by_name['Age']
            parameter_setting.save()
        elif parameter_setting.name == 'income_in':
            parameter_setting.column_id = column_ids_by_name['Income']
            parameter_setting.save()
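DatasetAnd2ColumnsFactory is likewise not shown; a sketch under the assumption that there are Dataset and Column models, a ColumnFactory, and column names matching the lookup above:

class DatasetAnd2ColumnsFactory(factory.DjangoModelFactory):
    class Meta:
        model = Dataset  # assumed model name

    # Hypothetical: create the two Column rows ('Age', 'Income') that the
    # post_generation hook finds via dataset.column_set.all().
    column_1 = factory.RelatedFactory(ColumnFactory, 'dataset', name='Age')
    column_2 = factory.RelatedFactory(ColumnFactory, 'dataset', name='Income')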
I then extended this approach, passing in options to configure the factory, like this:

whatever = WhateverFactory(options__an_option=True, options__another_option=True)

Then this factory code detected the options and generated the test data required (note the method is renamed to options to match the prefix on the parameter names):
@factory.post_generation
def options(self, create, not_used, **kwargs):
    # The standard code as above
    if kwargs.get('an_option', None):
        ...  # code for custom option 'an_option'
    if kwargs.get('another_option', None):
        ...  # code for custom option 'another_option'
I then further extended this. Because my desired models contained self joins, my factory is recursive. So for a call such as:

whatever = WhateverFactory(options__an_option='xyz',
                           options__an_option_for_a_nested_whatever='abc')

within @factory.post_generation I have:
class Meta:
    model = Whatever

# self is the top level object being generated
@factory.post_generation
def options(self, create, not_used, **kwargs):
    # This generates the nested object
    nested_object = WhateverFactory(
        options__an_option=kwargs.get('an_option_for_a_nested_whatever', None))
    # then join nested_object to self via the self join
    self.nested_whatever_id = nested_object.id
Some notes, which you do not need to read, on why I went with this option rather than @rhunwicks's proper solution to my question above. There were two reasons.
The thing that stopped me experimenting with it was that the order of RelatedFactory and post-generation execution is not reliable: apparently unrelated factors affect it, presumably a consequence of lazy evaluation. I had errors where a set of factories would suddenly stop working for no apparent reason. Once it was because I renamed the variables the RelatedFactory declarations were assigned to. This sounds ridiculous, but I tested it to death (and posted here); there is no doubt that renaming the variables reliably switched the sequence of RelatedFactory and post-generation execution. I still assumed this was some oversight on my behalf until it happened again for some other reason (which I never managed to diagnose).
Secondly, I found the declarative code confusing, inflexible and hard to refactor. It isn't straightforward to pass different configurations during instantiation so that the same factory can be used for different variations of test data, which meant I had to repeat code. Every working variable such as object needs adding to the factory's Meta.exclude list; that sounds trivial, but when you have pages of code generating data it was a reliable source of errors. As a developer you would have to pass over several factories several times to understand the control flow. Generation code would be spread through the declarative body until you had exhausted these tricks, and then the rest would go in post-generation or get very convoluted. A common example for me is a triad of interdependent models (e.g. a parent-children category structure, or dataset/attributes/entities) used as a foreign key of another triad of interdependent objects (e.g. models, parameter values, etc., referring to other models' parameter values). A few of these types of structures, especially if nested, quickly become unmanageable.
I realize it isn't really in the spirit of factory_boy, but putting everything into post-generation solved all these problems. I can pass in parameters, so the same single factory serves all my composite-model test data requirements and no code is repeated. The sequence of creation is immediately visible, obvious and completely reliable, rather than depending on confusing chains of inheritance and overriding and being subject to some ordering bug. The interactions are obvious, so you don't need to digest the whole thing to add some functionality, and different areas of functionality are grouped in the post-generation if clauses. There's no need to exclude working variables, and you can refer to them for the duration of the factory code. The unit test code is simplified, because the desired functionality is described by parameter names rather than Factory class names: you create data with a call like WhateverFactory(options__create_xyz=True, options__create_abc=True, ...) rather than WhateverCreateXYZCreateABC...(). This makes for a nice division of responsibilities that is quite clean to code.