Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Django QuerySet .count() is 0 and .exists() is false, even though there's an object in the QuerySet (Django Rest Framework)

I've had an intermittent bug for months and I am stuck! Help is appreciated.

class ContentFile(models.Model):
    """
    Represents the metadata for a single document.
    """
    name = models.CharField(max_length=200)
    set = models.ForeignKey(Set, related_name='contentfile_set', on_delete=models.CASCADE)
    location = models.FileField(upload_to=get_content_file_upload_path)

    class Meta:
        app_label = 'ProtocolManager'
        unique_together = ('set', 'location')

    def __str__(self):
        return f"File {self.name} at location {str(self.location)} attached to {self.set}"


@receiver(models.signals.post_delete, sender=ContentFile)
def delete_file(sender, instance, *args, **kwargs):
    """
    Delete orphaned file once last ContentFile is removed.
    """
    log.info(f"Checking {instance.location} for existing ContentFile")
    if instance.location:
        from django.db import connection
        from django.db import reset_queries
        reset_queries()
        files_queryset = ContentFile.objects.filter(location=instance.location)
        if not files_queryset.exists():
            log.info(f"ContentFile does not exist for {instance.location}, deleting...")
            log.info(f"SQL: {connection.queries}")
            log.info(f"files_queryset: {list(files_queryset.all())}")
            log.info(f"count: {files_queryset.count()}")
            log.info(f"exists: {files_queryset.exists()}")
            instance.location.delete(save=False)

The logic is pretty simple. When a ContentFile row is deleted, if it was the last one referring to an actual file location, we delete the file itself (on AWS S3, if it matters). 99% of the time this works as expected, but sometimes it will fail, and when it fails, it seems to fail for all of the ContentFiles associated with a given Set. I added the verbose logging to try to get more clues, but today it finally reproduced for the first time in months and the logging is just frustrating:

...
2021-06-08 20:30:05,329 [INFO] ProtocolManager.models: Checking 1129/314.pdf for existing ContentFile
2021-06-08 20:30:05,335 [INFO] ProtocolManager.models: ContentFile does not exist for 1129/314.pdf, deleting...
2021-06-08 20:30:05,335 [INFO] ProtocolManager.models: SQL: [{'sql': "SELECT (1) AS `a` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1129/314.pdf' LIMIT 1", 'time': '0.005'}]
2021-06-08 20:30:05,604 [INFO] ProtocolManager.models: files_queryset: [<ContentFile: File hypovolemic-traumatic-shockpdf.pdf at location 1129/314.pdf attached to Set v2021.6.8 '586-Joachim-Plattin Ambulance PCG' (1130-EDIT)>]
2021-06-08 20:30:05,610 [INFO] ProtocolManager.models: count: 0
2021-06-08 20:30:05,618 [INFO] ProtocolManager.models: exists: False
2021-06-08 20:30:05,685 [INFO] ProtocolManager.models: Checking 1116/312.pdf for existing ContentFile
2021-06-08 20:30:05,690 [INFO] ProtocolManager.models: Checking 1109/310.pdf for existing ContentFile
2021-06-08 20:30:05,700 [INFO] ProtocolManager.models: Checking 1101/306.pdf for existing ContentFile
2021-06-08 20:30:05,705 [INFO] ProtocolManager.models: Checking 1095/304.pdf for existing ContentFile
2021-06-08 20:30:05,711 [INFO] ProtocolManager.models: Checking 1089/301.pdf for existing ContentFile
2021-06-08 20:30:05,721 [INFO] ProtocolManager.models: Checking 1083/296.pdf for existing ContentFile
2021-06-08 20:30:05,726 [INFO] ProtocolManager.models: Checking 1080/292.pdf for existing ContentFile
2021-06-08 20:30:05,734 [INFO] ProtocolManager.models: Checking 1077/290.pdf for existing ContentFile
2021-06-08 20:30:05,739 [INFO] ProtocolManager.models: Checking 1075/289.pdf for existing ContentFile
2021-06-08 20:30:05,747 [INFO] ProtocolManager.models: Checking 1072/287.pdf for existing ContentFile
2021-06-08 20:30:05,754 [INFO] ProtocolManager.models: Checking 1066/285.pdf for existing ContentFile
2021-06-08 20:30:05,763 [INFO] ProtocolManager.models: Checking 1129/307.pdf for existing ContentFile
2021-06-08 20:30:05,768 [INFO] ProtocolManager.models: ContentFile does not exist for 1129/307.pdf, deleting...
2021-06-08 20:30:05,768 [INFO] ProtocolManager.models: SQL: [{'sql': "SELECT (1) AS `a` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1129/307.pdf' LIMIT 1", 'time': '0.004'}]
2021-06-08 20:30:05,943 [INFO] ProtocolManager.models: files_queryset: [<ContentFile: File 0004-ift-medications.pdf at location 1129/307.pdf attached to Set v2021.6.8 '586-Joachim-Plattin Ambulance PCG' (1130-EDIT)>]
2021-06-08 20:30:05,947 [INFO] ProtocolManager.models: count: 0
2021-06-08 20:30:05,952 [INFO] ProtocolManager.models: exists: False
2021-06-08 20:30:05,982 [INFO] ProtocolManager.models: Checking 855/123.pdf for existing ContentFile
2021-06-08 20:30:05,987 [INFO] ProtocolManager.models: Checking 1096/122.pdf for existing ContentFile
2021-06-08 20:30:05,992 [INFO] ProtocolManager.models: Checking 855/121.pdf for existing ContentFile
2021-06-08 20:30:05,997 [INFO] ProtocolManager.models: Checking 1116/295.pdf for existing ContentFile
...

In the log it shows that there is in fact the expected ContentFile row at the location in question (the output from list(files_queryset.all())) but both count() is 0 and exists() is false.

Does anyone have any ideas why this might be happening? Intermittent database issues to AWS RDS, perhaps?

The database log in the same period (note I removed some extraneous fields from the Model above for clarity, you still see them in the DB queries themselves):

2021-06-08 20:30:05,335 [DEBUG] django.db.backends: (0.005) SELECT (1) AS `a` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1129/314.pdf' LIMIT 1; args=('1129/314.pdf',)
2021-06-08 20:30:05,593 [DEBUG] django.db.backends: (0.257) SELECT `ProtocolManager_contentfile`.`id`, `ProtocolManager_contentfile`.`name`, `ProtocolManager_contentfile`.`set_id`, `ProtocolManager_contentfile`.`location`, `ProtocolManager_contentfile`.`extractedText`, `ProtocolManager_contentfile`.`pages`, `ProtocolManager_contentfile`.`md5`, `ProtocolManager_contentfile`.`fileSize`, `ProtocolManager_contentfile`.`createdOn`, `ProtocolManager_contentfile`.`createdBy_id`, `ProtocolManager_contentfile`.`lastUpdated`, `ProtocolManager_contentfile`.`updatedBy_id` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1129/314.pdf'; args=('1129/314.pdf',)
2021-06-08 20:30:05,602 [DEBUG] django.db.backends: (0.007) SELECT `ProtocolManager_set`.`id`, `ProtocolManager_set`.`setMeta_id`, `ProtocolManager_set`.`originSet_id`, `ProtocolManager_set`.`majorVersion`, `ProtocolManager_set`.`minorVersion`, `ProtocolManager_set`.`revisionVersion`, `ProtocolManager_set`.`acknowledgmentSetID`, `ProtocolManager_set`.`receiptAckText`, `ProtocolManager_set`.`pubLevel`, `ProtocolManager_set`.`minOSiOS`, `ProtocolManager_set`.`minOSAndroid`, `ProtocolManager_set`.`minVersioniOS`, `ProtocolManager_set`.`minVersionAndroid`, `ProtocolManager_set`.`createdOn`, `ProtocolManager_set`.`createdBy_id`, `ProtocolManager_set`.`lastUpdated`, `ProtocolManager_set`.`updatedBy_id` FROM `ProtocolManager_set` WHERE `ProtocolManager_set`.`id` = 1130 LIMIT 21; args=(1130,)
2021-06-08 20:30:05,604 [DEBUG] django.db.backends: (0.001) SELECT `ProtocolManager_setmeta`.`id`, `ProtocolManager_setmeta`.`region_id`, `ProtocolManager_setmeta`.`setName`, `ProtocolManager_setmeta`.`displayName`, `ProtocolManager_setmeta`.`isOffline`, `ProtocolManager_setmeta`.`offlineReason`, `ProtocolManager_setmeta`.`discontinued`, `ProtocolManager_setmeta`.`invisible`, `ProtocolManager_setmeta`.`isSponsored`, `ProtocolManager_setmeta`.`isOutdated`, `ProtocolManager_setmeta`.`lastUID`, `ProtocolManager_setmeta`.`companyNumberLabel`, `ProtocolManager_setmeta`.`companyNumberRegex`, `ProtocolManager_setmeta`.`customBundleId`, `ProtocolManager_setmeta`.`betaEnabled`, `ProtocolManager_setmeta`.`restrictedMD5`, `ProtocolManager_setmeta`.`createdOn`, `ProtocolManager_setmeta`.`lastUpdated`, `ProtocolManager_setmeta`.`createdBy_id`, `ProtocolManager_setmeta`.`updatedBy_id` FROM `ProtocolManager_setmeta` WHERE `ProtocolManager_setmeta`.`id` = 586 LIMIT 21; args=(586,)
2021-06-08 20:30:05,610 [DEBUG] django.db.backends: (0.005) SELECT COUNT(*) AS `__count` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1129/314.pdf'; args=('1129/314.pdf',)
2021-06-08 20:30:05,617 [DEBUG] django.db.backends: (0.007) SELECT (1) AS `a` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1129/314.pdf' LIMIT 1; args=('1129/314.pdf',)
2021-06-08 20:30:05,690 [DEBUG] django.db.backends: (0.004) SELECT (1) AS `a` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1116/312.pdf' LIMIT 1; args=('1116/312.pdf',)
2021-06-08 20:30:05,700 [DEBUG] django.db.backends: (0.009) SELECT (1) AS `a` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1109/310.pdf' LIMIT 1; args=('1109/310.pdf',)
2021-06-08 20:30:05,705 [DEBUG] django.db.backends: (0.004) SELECT (1) AS `a` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1101/306.pdf' LIMIT 1; args=('1101/306.pdf',)
2021-06-08 20:30:05,711 [DEBUG] django.db.backends: (0.006) SELECT (1) AS `a` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1095/304.pdf' LIMIT 1; args=('1095/304.pdf',)
2021-06-08 20:30:05,721 [DEBUG] django.db.backends: (0.009) SELECT (1) AS `a` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1089/301.pdf' LIMIT 1; args=('1089/301.pdf',)
2021-06-08 20:30:05,725 [DEBUG] django.db.backends: (0.004) SELECT (1) AS `a` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1083/296.pdf' LIMIT 1; args=('1083/296.pdf',)
2021-06-08 20:30:05,734 [DEBUG] django.db.backends: (0.008) SELECT (1) AS `a` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1080/292.pdf' LIMIT 1; args=('1080/292.pdf',)
2021-06-08 20:30:05,739 [DEBUG] django.db.backends: (0.004) SELECT (1) AS `a` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1077/290.pdf' LIMIT 1; args=('1077/290.pdf',)
2021-06-08 20:30:05,746 [DEBUG] django.db.backends: (0.007) SELECT (1) AS `a` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1075/289.pdf' LIMIT 1; args=('1075/289.pdf',)
2021-06-08 20:30:05,754 [DEBUG] django.db.backends: (0.007) SELECT (1) AS `a` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1072/287.pdf' LIMIT 1; args=('1072/287.pdf',)
2021-06-08 20:30:05,762 [DEBUG] django.db.backends: (0.007) SELECT (1) AS `a` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1066/285.pdf' LIMIT 1; args=('1066/285.pdf',)
2021-06-08 20:30:05,767 [DEBUG] django.db.backends: (0.004) SELECT (1) AS `a` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1129/307.pdf' LIMIT 1; args=('1129/307.pdf',)
2021-06-08 20:30:05,937 [DEBUG] django.db.backends: (0.169) SELECT `ProtocolManager_contentfile`.`id`, `ProtocolManager_contentfile`.`name`, `ProtocolManager_contentfile`.`set_id`, `ProtocolManager_contentfile`.`location`, `ProtocolManager_contentfile`.`extractedText`, `ProtocolManager_contentfile`.`pages`, `ProtocolManager_contentfile`.`md5`, `ProtocolManager_contentfile`.`fileSize`, `ProtocolManager_contentfile`.`createdOn`, `ProtocolManager_contentfile`.`createdBy_id`, `ProtocolManager_contentfile`.`lastUpdated`, `ProtocolManager_contentfile`.`updatedBy_id` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1129/307.pdf'; args=('1129/307.pdf',)
2021-06-08 20:30:05,940 [DEBUG] django.db.backends: (0.001) SELECT `ProtocolManager_set`.`id`, `ProtocolManager_set`.`setMeta_id`, `ProtocolManager_set`.`originSet_id`, `ProtocolManager_set`.`majorVersion`, `ProtocolManager_set`.`minorVersion`, `ProtocolManager_set`.`revisionVersion`, `ProtocolManager_set`.`acknowledgmentSetID`, `ProtocolManager_set`.`receiptAckText`, `ProtocolManager_set`.`pubLevel`, `ProtocolManager_set`.`minOSiOS`, `ProtocolManager_set`.`minOSAndroid`, `ProtocolManager_set`.`minVersioniOS`, `ProtocolManager_set`.`minVersionAndroid`, `ProtocolManager_set`.`createdOn`, `ProtocolManager_set`.`createdBy_id`, `ProtocolManager_set`.`lastUpdated`, `ProtocolManager_set`.`updatedBy_id` FROM `ProtocolManager_set` WHERE `ProtocolManager_set`.`id` = 1130 LIMIT 21; args=(1130,)
2021-06-08 20:30:05,942 [DEBUG] django.db.backends: (0.001) SELECT `ProtocolManager_setmeta`.`id`, `ProtocolManager_setmeta`.`region_id`, `ProtocolManager_setmeta`.`setName`, `ProtocolManager_setmeta`.`displayName`, `ProtocolManager_setmeta`.`isOffline`, `ProtocolManager_setmeta`.`offlineReason`, `ProtocolManager_setmeta`.`discontinued`, `ProtocolManager_setmeta`.`invisible`, `ProtocolManager_setmeta`.`isSponsored`, `ProtocolManager_setmeta`.`isOutdated`, `ProtocolManager_setmeta`.`lastUID`, `ProtocolManager_setmeta`.`companyNumberLabel`, `ProtocolManager_setmeta`.`companyNumberRegex`, `ProtocolManager_setmeta`.`customBundleId`, `ProtocolManager_setmeta`.`betaEnabled`, `ProtocolManager_setmeta`.`restrictedMD5`, `ProtocolManager_setmeta`.`createdOn`, `ProtocolManager_setmeta`.`lastUpdated`, `ProtocolManager_setmeta`.`createdBy_id`, `ProtocolManager_setmeta`.`updatedBy_id` FROM `ProtocolManager_setmeta` WHERE `ProtocolManager_setmeta`.`id` = 586 LIMIT 21; args=(586,)
2021-06-08 20:30:05,947 [DEBUG] django.db.backends: (0.004) SELECT COUNT(*) AS `__count` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1129/307.pdf'; args=('1129/307.pdf',)
2021-06-08 20:30:05,952 [DEBUG] django.db.backends: (0.004) SELECT (1) AS `a` FROM `ProtocolManager_contentfile` WHERE `ProtocolManager_contentfile`.`location` = '1129/307.pdf' LIMIT 1; args=('1129/307.pdf',)
like image 681
Oded Avatar asked Jun 08 '21 23:06

Oded


People also ask

What are the downsides of using an extra query in Django?

The main downside is that if you refer to some table alias of the queryset in the raw SQL, then it is possible that Django might change that alias (for example, when the queryset is used as a subquery in yet another query). You should be very careful whenever you use extra ().

How do I annotate a queryset in Django?

Each argument to annotate () is an annotation that will be added to each object in the QuerySet that is returned. The aggregation functions that are provided by Django are described in Aggregation Functions below. Annotations specified using keyword arguments will use the keyword as the alias for the annotation.

What is an optional Q OBJECT in Django?

An optional Q object that’s used to filter the rows that are aggregated. See Conditional aggregation and Filtering on annotations for example usage. New in Django 4.0. An optional argument that allows specifying a value to use as a default value when the queryset (or grouping) contains no entries.

What is a queryset in REST framework?

The root QuerySet provided by the Manager describes all objects in the database table. Usually, though, you'll need to select only a subset of the complete set of objects. The default behavior of REST framework's generic list views is to return the entire queryset for a model manager.


1 Answers

You might have a corrupted index on the ProtocolManager_contentfile table.

This question seems to have been dealing with a similar phenomenon. If this is the cause of your problem, you would probably need to REINDEX your database.

like image 82
MattRowbum Avatar answered Oct 16 '22 09:10

MattRowbum