 

Combined unique_fields in bulk_create without checking PK

I am trying to import data from multiple APIs, and there is a chance of duplicates. I am trying to bulk-create without duplication. None of the API sources provide me with a unique identifier. I am wondering what the best way to handle this situation is. What I have tried is as follows:

if 'rounds' in response:
    print('=== Syncing Rounds ===')
    rounds = response.get('rounds')
    objs = [
        Round(
            name         = item.get('name'),
            season       = Season.objects.get(id = item.get('seasonId')),
            competition  = Competition.objects.get(id = item.get('competitionId')),
            round_number = item.get('roundNumber'),
        )
        for item in rounds
    ]
    Round.objects.bulk_create(
        objs,
        update_conflicts=True,
        update_fields=['name', 'season', 'competition', 'round_number'],
        unique_fields=['id'],
    )

I also tried setting ignore_conflicts=True, but that approach didn't help me.

The round numbers range from 1 to 30 and the season is the year. In this situation, I cannot make a single field unique, such as round number, season, or competition. It has to look at all three. For example, there can be only one row for Round 1, season 2023, competition 112. The combination as a whole is unique.

Goal

The end goal is to either ensure no duplicate entries or update existing rows.

One solution, which its OP calls a hack, is described in the question "Bulk insert on multi-column unique constraint Django".

---Update--- Round Model

class Round(models.Model):
    name         = models.CharField(max_length=100)
    round_number = models.SmallIntegerField(null=True)
    season       = models.ForeignKey(Season, on_delete=models.CASCADE)
    competition  = models.ForeignKey(Competition, on_delete=models.CASCADE)
    start        = models.DateTimeField(null=True, blank=True)
    end          = models.DateTimeField(null=True, blank=True)
    tries        = models.SmallIntegerField(default=0)
    points       = models.SmallIntegerField(default=0)

    class Meta:
        # One row per (round_number, season, competition) combination.
        constraints = [
            models.UniqueConstraint(
                fields=['round_number', 'season', 'competition'],
                name='unique_round',
            ),
        ]

I have tried using constraints, but no dice.

Afnan Bashir asked Jan 01 '26 20:01



1 Answer

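Your unique_fields cannot be id, since the id is not determined by the object: the new Round objects have no id until the database assigns one, so it can never be used to detect a conflict. The unique_fields name the columns that together identify an existing row. A conflict occurs when an incoming object matches an existing row on all of them, and in that case the columns listed in update_fields are overwritten. So if the season, competition, and round_number are the same, we can for example update the name.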
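To make the role of the conflict target concrete, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Django. The table layout and the hand-written statement are assumptions for illustration only, and this is not the exact SQL Django emits; it only shows the same idea of a multi-column conflict target. It needs SQLite 3.24 or newer for the ON CONFLICT clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Illustrative table: the UNIQUE clause plays the role of the UniqueConstraint.
conn.execute(
    """
    CREATE TABLE round (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL,
        round_number INTEGER,
        season_id INTEGER NOT NULL,
        competition_id INTEGER NOT NULL,
        UNIQUE (round_number, season_id, competition_id)
    )
    """
)

# The conflict target lists the three columns; only name is updated on conflict.
UPSERT = """
    INSERT INTO round (name, round_number, season_id, competition_id)
    VALUES (?, ?, ?, ?)
    ON CONFLICT (round_number, season_id, competition_id)
    DO UPDATE SET name = excluded.name
"""

# First import: two new rows.
conn.executemany(UPSERT, [("Round 1", 1, 2023, 112), ("Round 2", 2, 2023, 112)])
# Second import: same (round, season, competition) as the first row, new name.
conn.executemany(UPSERT, [("Round One", 1, 2023, 112)])

rows = conn.execute(
    "SELECT name, round_number, season_id, competition_id "
    "FROM round ORDER BY round_number"
).fetchall()
print(rows)  # two rows; the first one now carries the name "Round One"
```

Note that round_number is nullable in the model. In both SQLite and PostgreSQL, NULL values count as distinct in a unique constraint by default, so rows without a round number would never conflict with each other.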

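Your view is also not very efficient. Yes, .bulk_create(…) will save a lot of INSERT queries, but the main bottleneck is fetching the Season and the Competition for every item, which costs two extra queries per round. That is not necessary: the foreign keys can be set directly through season_id and competition_id. If we know for sure the referenced objects exist, we can work with: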

if 'rounds' in response:
    objs = [
        Round(
            name=item['name'],
            season_id=item['seasonId'],
            competition_id=item['competitionId'],
            round_number=item['roundNumber'],
        )
        for item in response['rounds']
    ]
    Round.objects.bulk_create(
        objs,
        update_conflicts=True,
        update_fields=['name'],
        unique_fields=['season_id', 'competition_id', 'round_number'],
    )
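One thing to watch when the data is merged from several APIs: if the same (season, competition, round number) combination appears twice within a single batch, PostgreSQL rejects the upsert, because one INSERT ... ON CONFLICT DO UPDATE statement cannot affect the same row twice. A possible way around this is to collapse the items by that key before building the Round objects. The helper name dedupe_rounds below is hypothetical, and letting the last occurrence win is an assumption about the desired behavior.

```python
def dedupe_rounds(items):
    """Keep one item per (seasonId, competitionId, roundNumber); the last one wins."""
    unique = {}
    for item in items:
        key = (item["seasonId"], item["competitionId"], item["roundNumber"])
        unique[key] = item  # a later duplicate replaces the earlier item
    return list(unique.values())


# Made-up sample data: the third item duplicates the key of the first.
items = [
    {"name": "Round 1", "seasonId": 2023, "competitionId": 112, "roundNumber": 1},
    {"name": "Round 2", "seasonId": 2023, "competitionId": 112, "roundNumber": 2},
    {"name": "Rd 1", "seasonId": 2023, "competitionId": 112, "roundNumber": 1},
]
deduped = dedupe_rounds(items)
print([item["name"] for item in deduped])  # → ['Rd 1', 'Round 2']
```

The list comprehension above would then iterate over dedupe_rounds(response['rounds']) instead of response['rounds'].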
Willem Van Onsem answered Jan 03 '26 13:01



