Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Saving with minimum number of database queries (minimizing save method call)

I'm processing a json file with a large amount of nested information. For this purpose, in the UploadElementFromExcelFile(APIView) class I use nested loops at the very end of which I call the serializer.save() method. Then in the ElementCommonInfoSerializer serializer in the create() method I save/update data received from the serializer. And in addition, I create/update a RfiParticipationStatus model, which is related only to the highest nesting level (uses parent_category variable). But since the create method is called for the lowermost member of the loop, I make a lot of useless database queries for the RfiParticipationStatus model. The same thing with the company_information save. This is extra information from a shared json file. It doesn't refer to the serialization object and I need to save the company_information only once at the very beginning of the post method call. In my code implementation, a saving request occurs hundreds of times depending on the depth and content of the for loop.

json data sample

[
    {
        "Company_info": [
            {
                "question": "Company name",
                "answer": "Test"
            },
            {
                "question": "Parent company (if applicable)",
                "answer": "2test"
            },
            {....},
            {....}
        ]
    },
    {
        "Parent Category": " rtS2P",
        "Category": [
            {
                "Analytics": [
                    {
                        "Data Schema": [
                            {   '....': "",
                                "Attachments/Supporting Docs and Location/Link": "tui",
                                "SM score": 4,
                                "Analyst notes": "tytyt"
                            },
                            {   '....': "",
                                "Attachments/Supporting Docs and Location/Link": null,
                                "SM score": null,
                                "Analyst notes": null
                            },
                        ]
                    },
                    {
                        "Data Management": [
                            {
                                '....': "",
                                "Attachments/Supporting Docs and Location/Link": null,
                                "SM score": null,
                                "Analyst notes": null
                            },
                            {....}
                        ]
                    }
                ]
            },
            {
                "Configurability": [...]
            }
        ]
    },
    {
        "Parent Category": "DFG",
        "Category": [
            {
                "Contingent Workforce / Services Procurement": [...]
            },
            {
                "Performance Management": [...]
            }
        ]
    },
        "Parent Category": "...",
         .....
]

views.py

class UploadElementFromExcelFile(APIView):
    serializer_class = ElementCommonInfoSerializer

    def post(self, request, *args, **kwargs):
        context = {'rfiid': kwargs.get('rfiid'), 'vendor': kwargs.get('vendor'), 'analyst': kwargs.get('analyst')}
        data = request.data  # data is list of dict
        company_information = next(iter(data))
        context.update(company_information)

        try:
            with transaction.atomic():
                for pc_data in data[1:]:  # from dict get PC and Category participate data, exclude firs element - CI
                    parent_category = pc_data.get('Parent Category')
                    category_data = pc_data.get('Category')
                    for data in category_data:
                        for category, values in data.items():  # Get category name
                            for subcats in values:
                                for subcat, element_list in subcats.items():  # Get subcategory name
                                    for num, element in enumerate(element_list, 1):  # Get element info
                                        # some logic
                                        data = {......, ......,}
                                        serializer = ElementCommonInfoSerializer(data=data, context=context)
                                        serializer.is_valid(raise_exception=True)
                                        serializer.save()
        except ValidationError:
            return Response({"errors": (serializer.errors,)},
                            status=status.HTTP_400_BAD_REQUEST)
        else:
            return Response(request.data, status=status.HTTP_200_OK)

serializer.py

class ElementCommonInfoSerializer(serializers.ModelSerializer):
    .....
    .....
    def create(self, validated_data):
        ....

        # !!!! Question is in the  rfi_part_status variable below

        rfi_part_status, _ = RfiParticipationStatus.objects.update_or_create(status=pc_status, vendor=vendor, rfi=round,
                                                                             pc=parent_category.first(),
                                                defaults={'last_vendor_response': lvr, 'last_analyst_response': lar})

        # And question else in company_information save
                company_information = self.context.get('Company_info')
        for ci in company_information:
            ciq, _ = CompanyGeneralInfoQuestion.objects.get_or_create(question=ci.get('question'), rfi=round)
            cia, _ = CompanyGeneralInfoAnswers.objects.get_or_create(vendor=vendor, question=ciq,
                                                                 answer=ci.get('answer'))

        #another crete logic
        .....
        .....

        return self

The question is how do I call the creation of the rfi_part_status object only at the moment of passing through the uppermost element of the nested loop (for pc_data in data[1:]:). And the same situation with the company information

UPD (in accordance with Linovia question)!! models.py

class RfiParticipationStatus(models.Model):
    status = models.CharField(max_length=50, choices=STATUS_NAME)
    vendor = models.ForeignKey('Vendors', models.DO_NOTHING, related_name='to_vendor_status')
    rfi = models.ForeignKey('Rfis', models.DO_NOTHING, related_name='to_rfis_status')
    pc = models.ForeignKey(ParentCategories, models.DO_NOTHING, blank=True, null=True)
    last_vendor_response = models.IntegerField(blank=True, null=True)
    last_analyst_response = models.IntegerField(blank=True, null=True)

when an RfiParticipationStatus object is created only pc(parent category) value taken from serializator data. All all other values are calculated in the process.

like image 426
Jekson Avatar asked Oct 27 '22 23:10

Jekson


1 Answers

you should keep the business logic in Django model level and override the save method and use super() call to the override method you used for your custome desired behavior.

like image 180
auvipy Avatar answered Dec 04 '22 23:12

auvipy