Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Django: merging objects

I have such model:

class Place(models.Model):
    name = models.CharField(max_length=80, db_index=True)
    city = models.ForeignKey(City)
    address = models.CharField(max_length=255, db_index=True)
    # and so on

Since I'm importing them from many sources, and users of my website are able to add new Places, I need a way to merge them from an admin interface. Problem is, name is not very reliable since they can be spelled in many different ways, etc I'm used to use something like this:

class Place(models.Model):
    name = models.CharField(max_length=80, db_index=True) # canonical
    city = models.ForeignKey(City)
    address = models.CharField(max_length=255, db_index=True)
    # and so on

class PlaceName(models.Model):
    name = models.CharField(max_length=80, db_index=True)
    place = models.ForeignKey(Place)

query like this

Place.objects.get(placename__name='St Paul\'s Cathedral', city=london)

and merge like this

class PlaceAdmin(admin.ModelAdmin):
    actions = ('merge', )

    def merge(self, request, queryset):
        main = queryset[0]
        tail = queryset[1:]

        PlaceName.objects.filter(place__in=tail).update(place=main)
        SomeModel1.objects.filter(place__in=tail).update(place=main)
        SomeModel2.objects.filter(place__in=tail).update(place=main)
        # ... etc ...

        for t in tail:
            t.delete()

        self.message_user(request, "%s is merged with other places, now you can give it a canonical name." % main)
    merge.short_description = "Merge places"

as you can see, I have to update all other models with FK to Place with new values. But it's not very good solution since I have to add every new model to this list.

How do I "cascade update" all foreign keys to some objects prior to deleting them?

Or maybe there are other solutions to do/avoid merging

like image 821
Valentin Golev Avatar asked Aug 03 '10 03:08

Valentin Golev


3 Answers

If anyone intersted, here is really generic code for this:

def merge(self, request, queryset):
    main = queryset[0]
    tail = queryset[1:]

    related = main._meta.get_all_related_objects()

    valnames = dict()
    for r in related:
        valnames.setdefault(r.model, []).append(r.field.name)

    for place in tail:
        for model, field_names in valnames.iteritems():
            for field_name in field_names:
                model.objects.filter(**{field_name: place}).update(**{field_name: main})

        place.delete()

    self.message_user(request, "%s is merged with other places, now you can give it a canonical name." % main)
like image 161
Valentin Golev Avatar answered Sep 19 '22 22:09

Valentin Golev


Based on the snippet provided in the comments in the accepted answer, I was able to develop the following. This code does not handle GenericForeignKeys. I don't ascribe to their use as I believe it indicates a problem with the model you are using.

I used list a lot of code to do this in this answer, but I have updated my code to use django-super-deduper mentioned here. At the time, django-super-deduper did not handle unmanaged models in a good way. I submitted an issue, and it looks like it will be corrected soon. I also use django-audit-log, and I don't want to merge those records. I kept the signature and the @transaction.atomic() decorator. This is helpful in the event of a problem.

from django.db import transaction
from django.db.models import Model, Field
from django_super_deduper.merge import MergedModelInstance


class MyMergedModelInstance(MergedModelInstance):
    """
        Custom way to handle Issue #11: Ignore models with managed = False
        Also, ignore auditlog models.
    """
    def _handle_o2m_related_field(self, related_field: Field, alias_object: Model):
        if not alias_object._meta.managed and "auditlog" not in alias_object._meta.model_name:
            return super()._handle_o2m_related_field(related_field, alias_object)

    def _handle_m2m_related_field(self, related_field: Field, alias_object: Model):
        if not alias_object._meta.managed and "auditlog" not in alias_object._meta.model_name:
            return super()._handle_m2m_related_field(related_field, alias_object)

    def _handle_o2o_related_field(self, related_field: Field, alias_object: Model):
        if not alias_object._meta.managed and "auditlog" not in alias_object._meta.model_name:
            return super()._handle_o2o_related_field(related_field, alias_object)


@transaction.atomic()
def merge(primary_object, alias_objects):
    if not isinstance(alias_objects, list):
        alias_objects = [alias_objects]
    MyMergedModelInstance.create(primary_object, alias_objects)
    return primary_object
like image 27
Bobort Avatar answered Sep 19 '22 22:09

Bobort


Tested on Django 1.10. Hope it can serve.

def merge(primary_object, alias_objects, model):
"""Merge 2 or more objects from the same django model
The alias objects will be deleted and all the references 
towards them will be replaced by references toward the 
primary object
"""
if not isinstance(alias_objects, list):
    alias_objects = [alias_objects]

if not isinstance(primary_object, model):
    raise TypeError('Only %s instances can be merged' % model)

for alias_object in alias_objects:
    if not isinstance(alias_object, model):
        raise TypeError('Only %s instances can be merged' % model)

for alias_object in alias_objects:
    # Get all the related Models and the corresponding field_name
    related_models = [(o.related_model, o.field.name) for o in alias_object._meta.related_objects]
    for (related_model, field_name) in related_models:
        relType = related_model._meta.get_field(field_name).get_internal_type()
        if relType == "ForeignKey":
            qs = related_model.objects.filter(**{ field_name: alias_object })
            for obj in qs:
                setattr(obj, field_name, primary_object)
                obj.save()
        elif relType == "ManyToManyField":
            qs = related_model.objects.filter(**{ field_name: alias_object })
            for obj in qs:
                mtmRel = getattr(obj, field_name)
                mtmRel.remove(alias_object)
                mtmRel.add(primary_object)
    alias_object.delete()
return True
like image 44
alex59493 Avatar answered Sep 22 '22 22:09

alex59493