Recently I've been running into issues with the update_or_create method. Let me give a full explanation first.
Model:
class TransactionPageVisits(models.Model):
    transactionid = models.ForeignKey(
        Transaction,
        on_delete=models.CASCADE,
        db_column='transactionid',
    )
    sessionid = models.CharField(max_length=40, db_index=True)
    ip_address = models.CharField(max_length=39, editable=False)
    user_agent = models.TextField(null=True, editable=False)
    page = models.CharField(max_length=100, null=True, db_index=True)
    method = models.CharField(max_length=20, null=True)
    url = models.TextField(null=False, editable=False)
    created_dtm = models.DateTimeField(auto_now_add=True)

    class Meta(object):
        ordering = ('created_dtm',)
Function:
def _tracking(self, request, response, **kwargs):
    txn_details = kwargs.get('txn_details')
    data = {
        'sessionid': request.session.session_key,
        'ip_address': get_ip_address(request),
        'user_agent': get_user_agent(request),
        'method': request.method,
        'url': request.build_absolute_uri(),
        'transactionid': txn_details.txn_object,
        'page': kwargs.get('page')
    }
    # Keep updating/creating tracking data to model
    obj, created = TransactionPageVisits.objects.update_or_create(**data)
Notes:
I know I'm not passing a defaults argument to update_or_create(); when the code was written it wasn't needed (I wanted a new row to be created only when the combination of all the values in data was not already present in the table). Also, _tracking() lives in middleware and is called on every request and response.
Everything was running smoothly until today, when I got the following exception:
  File "trackit.py", line 65, in _tracking
    obj, created = TransactionPageVisits.objects.update_or_create(**data)
  File "/usr/local/lib/python2.7/dist-packages/Django-1.10.4-py2.7.egg/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/Django-1.10.4-py2.7.egg/django/db/models/query.py", line 488, in update_or_create
    obj = self.get(**lookup)
  File "/usr/local/lib/python2.7/dist-packages/Django-1.10.4-py2.7.egg/django/db/models/query.py", line 389, in get
    (self.model._meta.object_name, num)
MultipleObjectsReturned: get() returned more than one TransactionPageVisits -- it returned 2!
I noticed that two entries had been created in the table with exactly the same values (except created_dtm, since it has auto_now_add=True):
| id | sessionid | ip_address | user_agent | page | method | url | created_dtm | transactionid |
| 32858 | nrq2vwxbtsjp8yoibotpsur0zit5jhoq | xx.xxx.xxx.xxx | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0 | | GET | https://www.example.com/example_url/?jobid=5a9f2acb4cedfd00011c7d5d&transactionid=XXXXXXXXXXXX | 2018-03-06 23:57:00.061280 | XXXXXXXXXXXX |
| 32859 | nrq2vwxbtsjp8yoibotpsur0zit5jhoq | xx.xxx.xxx.xxx | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0 | | GET | https://www.example.com/example_url/?jobid=5a9f2acb4cedfd00011c7d5d&transactionid=XXXXXXXXXXXX | 2018-03-06 23:57:00.062121 | XXXXXXXXXXXX |
Why was a duplicate entry created in the table in the first place?
update_or_create is prone to a race condition, as described in the documentation:
As described above in get_or_create(), this method is prone to a race-condition which can result in multiple rows being inserted simultaneously if uniqueness is not enforced at the database level.
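To make the race concrete, here is a minimal sketch that uses Python's built-in sqlite3 module rather than Django. It is not how Django implements update_or_create internally; the `visits` table and the `exists` and `insert` helpers are hypothetical, and the two "requests" are interleaved by hand in a single thread so the outcome is deterministic. Both requests run their lookup before either has inserted, so both decide to create a row:

```python
import sqlite3

# In-memory table with NO unique constraint, like the model in the question.
# (Hypothetical, cut-down schema used only for this illustration.)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (id INTEGER PRIMARY KEY, sessionid TEXT, url TEXT)"
)


def exists(sessionid, url):
    """The 'get' half of get-then-create: is a matching row already there?"""
    row = conn.execute(
        "SELECT 1 FROM visits WHERE sessionid = ? AND url = ?",
        (sessionid, url),
    ).fetchone()
    return row is not None


def insert(sessionid, url):
    """The 'create' half of get-then-create."""
    conn.execute(
        "INSERT INTO visits (sessionid, url) VALUES (?, ?)", (sessionid, url)
    )


key = ("abc123", "https://www.example.com/example_url/")

# Two near-simultaneous requests both perform the lookup first ...
request_a_found = exists(*key)
request_b_found = exists(*key)

# ... and because neither saw a row, both go on to insert one.
if not request_a_found:
    insert(*key)
if not request_b_found:
    insert(*key)

duplicate_count = conn.execute(
    "SELECT COUNT(*) FROM visits WHERE sessionid = ? AND url = ?", key
).fetchone()[0]
print(duplicate_count)
```

Once two matching rows exist, any later lookup that expects exactly one match for that key finds two, which is what the MultipleObjectsReturned in the traceback reports.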
You can use unique_together in the model, as suggested in another answer. I've never tested this, but apparently Django catches the IntegrityError caused by these race conditions.
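A rough sketch of that idea, again with plain sqlite3 instead of Django: once the database enforces a UNIQUE constraint over the lookup columns, the losing insert fails with an IntegrityError and the code can fall back to reading the row that won. The `visits` table, the `get_or_create` helper and its `skip_lookup` flag are hypothetical stand-ins, not Django's code. In the model from the question the equivalent would presumably be a unique_together entry in Meta covering the lookup fields, applied with a migration after the existing duplicate rows have been cleaned up.

```python
import sqlite3

# In-memory table WITH a unique constraint over the lookup columns.
# (Hypothetical, cut-down schema used only for this illustration.)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits ("
    " id INTEGER PRIMARY KEY,"
    " sessionid TEXT,"
    " url TEXT,"
    " UNIQUE (sessionid, url))"
)

SELECT_SQL = "SELECT id FROM visits WHERE sessionid = ? AND url = ?"
INSERT_SQL = "INSERT INTO visits (sessionid, url) VALUES (?, ?)"


def get_or_create(sessionid, url, skip_lookup=False):
    """Return (row_id, created).

    skip_lookup=True simulates a request whose lookup ran before the
    competing request's insert, so it goes straight to the insert.
    """
    if not skip_lookup:
        row = conn.execute(SELECT_SQL, (sessionid, url)).fetchone()
        if row is not None:
            return row[0], False
    try:
        cursor = conn.execute(INSERT_SQL, (sessionid, url))
        return cursor.lastrowid, True
    except sqlite3.IntegrityError:
        # Another request won the race: reuse the row it created.
        row = conn.execute(SELECT_SQL, (sessionid, url)).fetchone()
        return row[0], False


key = ("abc123", "https://www.example.com/example_url/")

first = get_or_create(*key)                     # creates the row
second = get_or_create(*key, skip_lookup=True)  # loses the race, falls back

row_count = conn.execute("SELECT COUNT(*) FROM visits").fetchone()[0]
print(first, second, row_count)
```

The point of the design is that the database, not the application, decides which insert wins, so the check holds even when the two requests run in different processes.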