
Running loaddata on Heroku without adding the data file to the repository




I need to run a manage.py loaddata command to import some data into the database of my Heroku instance, and Heroku's ephemeral file system makes this awkward. I would really prefer not to add the data files to my Heroku repository and push an update every time I want to run loaddata, since I'll need to do this regularly, with different files for different Heroku instances running the same code base. Is there a way to either a) run loaddata on a remote instance without the data file residing on the instance's file system, perhaps by piping the data in or referencing a local file, or b) upload a file and run loaddata in the same session, so that the file exists on the instance while the command is executing? (I realize that it will disappear as soon as the interactive session ends.)

B Robster asked Feb 23 '13 14:02

3 Answers

(...several years later)

@Ben Roberts' approach is sensible, but note that the obstacles it works around have since been fixed:

  • Django's loaddata now accepts input from stdin when the fixture name is given as - (the format must then be passed explicitly with --format)
  • Piping into heroku run now works (the issue linked by Ben is now closed)

So you don't need a custom management command. Loading data from a local file into Heroku should now be as simple as:

$ cat your-data-file.json | heroku run --no-tty -a <your-app> -- python manage.py loaddata --format=json -

Bonus: for the equal and opposite action, you can dump data using the answer here.

[Edit: --no-tty option added thanks to @rgov]
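
If you need to repeat this for several apps and fixture files, the one-liner above can be wrapped in a small script. This is only a sketch: it assumes the heroku CLI is installed, on your PATH and logged in, and the function names (build_loaddata_command, push_fixture) are made up for illustration.

```python
import subprocess


def build_loaddata_command(app, fmt="json"):
    """Build the same `heroku run` invocation as the one-liner above."""
    return [
        "heroku", "run", "--no-tty", "-a", app, "--",
        "python", "manage.py", "loaddata", f"--format={fmt}", "-",
    ]


def push_fixture(app, fixture_path, fmt="json"):
    """Stream a local fixture file to loaddata on the given app via stdin."""
    # Assumption: the heroku CLI is available and authenticated.
    with open(fixture_path, "rb") as fixture:
        # check=True raises CalledProcessError if the remote command fails.
        return subprocess.run(
            build_loaddata_command(app, fmt), stdin=fixture, check=True
        )
```

Calling push_fixture("your-app", "your-data-file.json") should then behave like the cat pipeline, with the fixture opened directly as the process's stdin.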

thclark answered Nov 13 '22 17:11


Here's what I came up with (using my (a) idea of piping from stdin), but it doesn't work due to this issue with heroku run: https://github.com/heroku/heroku/issues/256

A management command that wraps loaddata so that it reads from stdin (it could just as well be written as a plain Python script if you set up the Django environment):

# someapp/management/commands/loaddata_stdin.py

import sys

from django.core.management import call_command
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = "Load a JSON fixture piped in on stdin."

    def transfer_stdin_to_tempfile(self):
        # Reads everything at once; read in chunks if the content is expected to be huge.
        content = sys.stdin.read()
        # loaddata infers the format from the file extension, so keep ".json".
        with open('temp.json', 'w') as outfile:
            outfile.write(content)
        return outfile.name

    def handle(self, *args, **options):
        tempfile_name = self.transfer_stdin_to_tempfile()
        call_command('loaddata', tempfile_name, traceback=True)


$ cat some_dump.json | heroku run python manage.py loaddata_stdin
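
One possible variation on the stdin-to-file step: use Python's tempfile module so the fixture gets a unique name instead of a fixed temp.json in the working directory. This is a sketch only, and stream_to_tempfile is a hypothetical helper name, not part of Django. The .json suffix is kept because loaddata picks the serialization format from the file extension.

```python
import shutil
import sys
import tempfile


def stream_to_tempfile(stream=None, suffix=".json"):
    """Copy a text stream to a uniquely named temp file and return its path."""
    if stream is None:
        stream = sys.stdin
    # delete=False keeps the file on disk after it is closed, so that
    # loaddata can open it by name; the caller should remove it afterwards.
    with tempfile.NamedTemporaryFile(
        "w", suffix=suffix, delete=False, encoding="utf-8"
    ) as outfile:
        # Copies in chunks, so large fixtures are not held fully in memory.
        shutil.copyfileobj(stream, outfile)
        return outfile.name
```

Inside a management command, the returned path could be handed to call_command('loaddata', path) and then deleted with os.remove(path).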

B Robster answered Nov 13 '22 17:11

Heroku's PG Backups add-on can help you with that (perhaps it didn't exist at this time last year): https://devcenter.heroku.com/articles/heroku-postgres-import-export

The tutorial describes, in fairly straightforward terms, how to use pg_dump to create the database dump (adding the commands here in case the link changes):

$ pg_dump -Fc --no-acl --no-owner -h localhost -U <your username> mydb > mydb.dump

I personally uploaded mydb.dump to a Dropbox folder and then ran the pgbackups command:

$ heroku pgbackups:restore <database url> '<url for mydb.dump>'

I tried your method and it worked, but I ran into some problems as the file size got bigger.

3cheesewheel answered Nov 13 '22 16:11
