The MySQL database powering our Django site has developed some integrity problems; e.g. foreign keys that refer to nonexistent rows. I won't go into how we got into this mess, but I'm now looking at how to fix it.
Basically, I'm looking for a script that scans all models in the Django site, and checks whether all foreign keys and other constraints are correct. Hopefully, the number of problems will be small enough so they can be fixed by hand.
I could code this up myself but I'm hoping that somebody here has a better idea.
I found django-check-constraints but it doesn't quite fit the bill: right now, I don't need something to prevent these problems, but to find them so they can be fixed manually before taking other steps.
Other constraints:
(Later, we will convert to InnoDB for proper transaction support, and maybe foreign key constraints on the database level, to prevent similar problems in the future. But that's not the topic of this question.)
I whipped up something myself. The management script below should be saved in myapp/management/commands/checkdb.py
. Make sure that intermediate directories have an __init__.py
file.
Usage: ./manage.py checkdb
for a full check; use --exclude app.Model
or -e app.Model
to exclude the model Model
in the app app
.
from django.core.management.base import BaseCommand, CommandError
from django.core.management.base import NoArgsCommand
from django.core.exceptions import ObjectDoesNotExist
from django.db import models
from optparse import make_option
from lib.progress import with_progress_meter
def model_name(model):
return '%s.%s' % (model._meta.app_label, model._meta.object_name)
class Command(BaseCommand):
args = '[-e|--exclude app_name.ModelName]'
help = 'Checks constraints in the database and reports violations on stdout'
option_list = NoArgsCommand.option_list + (
make_option('-e', '--exclude', action='append', type='string', dest='exclude'),
)
def handle(self, *args, **options):
# TODO once we're on Django 1.2, write to self.stdout and self.stderr instead of plain print
exclude = options.get('exclude', None) or []
failed_instance_count = 0
failed_model_count = 0
for app in models.get_apps():
for model in models.get_models(app):
if model_name(model) in exclude:
print 'Skipping model %s' % model_name(model)
continue
fail_count = self.check_model(app, model)
if fail_count > 0:
failed_model_count += 1
failed_instance_count += fail_count
print 'Detected %d errors in %d models' % (failed_instance_count, failed_model_count)
def check_model(self, app, model):
meta = model._meta
if meta.proxy:
print 'WARNING: proxy models not currently supported; ignored'
return
# Define all the checks we can do; they return True if they are ok,
# False if not (and print a message to stdout)
def check_foreign_key(model, field):
foreign_model = field.related.parent_model
def check_instance(instance):
try:
# name: name of the attribute containing the model instance (e.g. 'user')
# attname: name of the attribute containing the id (e.g. 'user_id')
getattr(instance, field.name)
return True
except ObjectDoesNotExist:
print '%s with pk %s refers via field %s to nonexistent %s with pk %s' % \
(model_name(model), str(instance.pk), field.name, model_name(foreign_model), getattr(instance, field.attname))
return check_instance
# Make a list of checks to run on each model instance
checks = []
for field in meta.local_fields + meta.local_many_to_many + meta.virtual_fields:
if isinstance(field, models.ForeignKey):
checks.append(check_foreign_key(model, field))
# Run all checks
fail_count = 0
if checks:
for instance in with_progress_meter(model.objects.all(), model.objects.count(), 'Checking model %s ...' % model_name(model)):
for check in checks:
if not check(instance):
fail_count += 1
return fail_count
I'm making this a community wiki because I welcome any and all improvements to my code!
Thomas' answer is great but is now a bit out of date. I have updated it as a gist to support Django 1.8+.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With