
Performance Optimisation for connecting data in django models

At my job, I often have two tables in my Django models that I need to connect in order to return the combined data, for example as a CSV. The tables are not linked by a foreign key, but they share an identifier that can be used to connect them. The reason is that we import the data from two different sources and sometimes the counterpart is missing, so I can't connect the rows when an entry is created.

My question is: what is the best way to connect this data in terms of performance, given that I often have to return it?

  1. Way: Create a new model that connects the data (like an m2m), or a parent class holding the identifier that both models link to.
class OrderInvoiceConnector(models.Model):
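    # on_delete is required since Django 2.0; CASCADE matches the old default
    order_data = models.ForeignKey(Order, related_name="invoice", on_delete=models.CASCADE)
    invoice_data = models.ForeignKey(Invoice, related_name="order", on_delete=models.CASCADE)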
  2. Way: Create a new model that saves only the data that is needed for the CSV export. Something like:
class ConnectedData(models.Model):
    invoice_id = models.CharField(max_length=255)
    country_iso = models.CharField(max_length=255)
    invoice_date = models.CharField(max_length=255)
    tax = models.FloatField()
    price = models.FloatField()
herbertp. asked Dec 14 '15 13:12



1 Answer

I would go with the second variant, since you mentioned that joining would be expensive and that the data changes on a daily basis. With a read-only model, all the data needed for the export sits in a single table, so one query serves it. You would have to populate that table with an automated job, but that looks acceptable in the scenario you describe.
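
As a rough sketch of what such a job could do, the snippet below joins the two sources on their shared identifier in plain Python and renders the result as CSV. The key name `identifier`, the choice of which source supplies which column, and the function names `build_export_rows` and `rows_to_csv` are all assumptions made for illustration; only the output columns come from the `ConnectedData` model in the question.

```python
import csv
import io

# Output columns, taken from the ConnectedData model in the question.
EXPORT_FIELDS = ["invoice_id", "country_iso", "invoice_date", "tax", "price"]


def build_export_rows(orders, invoices):
    """Join orders and invoices on a shared key (hypothetical name: 'identifier').

    Orders whose invoice has not been imported yet are skipped, so a later
    run of the job can pick them up once the counterpart arrives.
    """
    # Index invoices once, so each order lookup is a constant-time dict access.
    invoices_by_id = {inv["identifier"]: inv for inv in invoices}
    rows = []
    for order in orders:
        invoice = invoices_by_id.get(order["identifier"])
        if invoice is None:
            continue  # counterpart missing, nothing to export yet
        rows.append({
            "invoice_id": invoice["invoice_id"],
            "country_iso": order["country_iso"],    # assumed to live on the order
            "invoice_date": invoice["invoice_date"],
            "tax": invoice["tax"],
            "price": order["price"],                # assumed to live on the order
        })
    return rows


def rows_to_csv(rows):
    """Render the pre-joined rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=EXPORT_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


orders = [
    {"identifier": "A1", "country_iso": "DE", "price": 100.0},
    {"identifier": "B2", "country_iso": "FR", "price": 50.0},  # no invoice yet
]
invoices = [
    {"identifier": "A1", "invoice_id": "INV-1", "invoice_date": "2015-12-01", "tax": 19.0},
]

rows = build_export_rows(orders, invoices)
csv_text = rows_to_csv(rows)
```

In a Django project the job would presumably read the two querysets instead of lists of dicts and store the joined rows in the read-only model (for instance with `bulk_create`), so that the export itself only has to read one table.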

Lorenzo Peña answered Nov 02 '22 23:11
