Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL INSERT ... ON DUPLICATE KEY UPDATE with django 1.4 for bulk insert

I am having issues figuring out MySQL INSERT ... ON DUPLICATE KEY UPDATE with django 1.4.

The table that I am trying to insert records has a 2 column(composite) unique key. Records that I am receiving is from a 3rd party source and values will change over time except for those fields that makes the unique key set. I am receiving 1 ~ 5k records at a time, and would need to

Currently I am using Model.objects.bulk_create to bulk insert, performance is really amazing as it issues generally one query no matter how big the record set is. However, as my records can change over time on the 3rd party end, I need to perform the MySQL INSERT ... ON DUPLICATE KEY UPDATE query on the recordset.

I am planning to write raw SQL statements and execute using something like here:

sql = "MySQL INSERT ... ON DUPLICATE KEY UPDATE"

raw_insert(sql)

def raw_insert(sql):
    from django.db import connection, transaction
    cursor = connection.cursor()

    # Data modifying operation - commit required
    cursor.execute(sql)
    transaction.commit_unless_managed()

    return 1

Wondering if there is a better solution to my problem. Also how would I sanitize the field values for raw insert?

like image 872
mmohiudd Avatar asked Oct 24 '25 08:10

mmohiudd


2 Answers

So I created a custom manager. Here is the manager:

class BulkInsertManager(models.Manager):
    def _bulk_insert_or_update(self, create_fields, update_fields, values):

        from django.db import connection, transaction
        cursor = connection.cursor()

        db_table = self.model._meta.db_table

        values_sql = []
        values_data =[]

        for value_lists in values:
            values_sql.append( "(%s)" % (','.join([ "%s" for i in range(len(value_lists))]),) )
            values_data.extend(value_lists)

        base_sql = "INSERT INTO %s (%s) VALUES " % (db_table, ",".join(create_fields))

        on_duplicates = []

        for field in update_fields:
            on_duplicates.append(field + "=VALUES(" + field +")")

        sql = "%s %s ON DUPLICATE KEY UPDATE %s" % (base_sql, ", ".join(values_sql), ",".join(on_duplicates))

        cursor.executemany(sql, [values_data])
        transaction.commit_unless_managed()

And a sample model:

class User_Friend(models.Model):
    objects = BulkInsertManager() # assign a custom manager to handle bulk insert

    id = models.CharField(max_length=255)
    user = models.ForeignKey(User, null=False, blank=False)
    first_name = models.CharField(max_length=30)
    last_name = models.CharField(max_length=30)
    city = models.CharField(max_length=50, null=True, blank=True)
    province = models.CharField(max_length=50, null=True, blank=True)
    country =  models.CharField(max_length=30, null=True, blank=True)

And sample implementation:

def save_user_friends(user, friends):
    user_friends = []
    for friend in friends:

        create_fields = ['id', 'user_id', 'first_name', 'last_name', 'city', 'province', 'country']
        update_fields = ['first_name', 'last_name', 'city', 'province', 'country']

        user_friends.append(
            [
                str(user.id), 
                str(friend['id']),
                friend['first_name'],
                friend['last_name'],
                friend['city'],
                friend['province'],
                friend['country'],
            ]
        )

    User_Friend.objects._bulk_insert_or_update(create_fields, update_fields, user_friends)

Here is the gist.

like image 174
mmohiudd Avatar answered Oct 26 '25 23:10

mmohiudd


You could sanitize with a ModelForm:

from django.forms.models import modelform_factory
form_class = modelform_factory(MyModel)

for obj in my_data:
    form = form_class(obj)
    if not form.is_valid():
        raise Hell()

As to the raw SQL, I say go for it. It doesn't appear that Django's ORM supports ON DUPLICATE KEY UPDATE, so don't let it get in your way. The Django docs talk about doing it without any reservation.

It may be worth using Manager.raw, though.

like image 26
dokkaebi Avatar answered Oct 26 '25 21:10

dokkaebi



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!