
Django Selective Dumpdata

Is it possible to filter which records Django's dumpdata management command outputs? I have a few models, each with millions of rows, and I only want to dump the records in one model that match specific criteria, along with all records linked to them by foreign key.

Consider this use case. Say I have a production database where my User model has millions of records, and several other models (Log, Transaction, Purchase, Bookmarks, etc.) all reference the User model. I want to do development on my Django app and test with realistic data. However, my production database is so large that I can't realistically take a snapshot of the entire thing and load it locally. Ideally, I'd use dumpdata to dump 50 random User records and all their related records to JSON, and use that to populate a development database.

Is there an easy way to accomplish this?

Cerin asked Nov 29 '11 15:11




2 Answers

I think django-fixture-magic might be worth a look.

You'll find some additional background info in Scrubbing your Django database.

arie answered Sep 22 '22 07:09


This snippet might be helpful for you (it follows relationships and serializes them):

http://djangosnippets.org/snippets/918/

You could also use the dumpdata management command itself and override the default managers of whichever models you like, so that they return custom querysets.
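
To give a rough idea of what following relationships and serializing them can look like, here is a minimal sketch. It is not the code from the linked snippet. The helper names `collect_with_related` and `dump_sample` are made up for illustration, and the accessor names you pass in (for example `log_set` or `purchase_set`) are assumptions that depend on how your foreign keys are declared. It relies on `django.core.serializers.serialize`, which accepts any iterable of model instances.

```python
import random


def collect_with_related(seeds, reverse_accessors):
    """Return the seed objects followed by every object reachable
    through the named reverse foreign-key accessors.

    Seeds come first so that loaddata sees each user before the rows
    that reference it.
    """
    collected = list(seeds)
    for seed in list(collected):
        for name in reverse_accessors:
            # e.g. user.log_set.all() yields the Log rows pointing at user
            # (accessor names are hypothetical and project-specific)
            collected.extend(getattr(seed, name).all())
    return collected


def dump_sample(user_model, reverse_accessors, sample_size=50):
    """Serialize a random sample of users plus their related rows to JSON."""
    # Imported lazily so collect_with_related stays usable outside a
    # configured Django project.
    from django.core import serializers

    # Sampling primary keys in Python avoids ORDER BY RANDOM() on a huge
    # table, at the cost of loading every pk into memory once.
    pks = list(user_model.objects.values_list("pk", flat=True))
    chosen = random.sample(pks, min(sample_size, len(pks)))
    seeds = user_model.objects.filter(pk__in=chosen)
    return serializers.serialize(
        "json", collect_with_related(seeds, reverse_accessors), indent=2
    )
```

Calling something like `dump_sample(User, ["log_set", "transaction_set"])` from a Django shell and writing the result to a file should give a fixture that loaddata can read. Note that this only follows one level of reverse relations; anything those related rows point to themselves (other than the sampled users) would have to be collected separately.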

Phil Avery answered Sep 22 '22 07:09