Commit manually in Django data migration

I'd like to write a data migration that modifies all rows of a big table in smaller batches in order to avoid locking issues. However, I can't figure out how to commit manually in a Django migration. Every time I call transaction.commit() I get:

TransactionManagementError: This is forbidden when an 'atomic' block is active.

AFAICT, the database schema editor always wraps Postgres migrations in an atomic block.

Is there a sane way to break out of the transaction from within the migration?

My migration looks like this:

from django.db import migrations, transaction

def modify_data(apps, schema_editor):
    counter = 0
    BigData = apps.get_model("app", "BigData")
    for row in BigData.objects.iterator():
        # Modify row [...]
        row.save()
        # Commit every 1000 rows
        counter += 1
        if counter % 1000 == 0:
            transaction.commit()
    transaction.commit()

class Migration(migrations.Migration):
    operations = [
        migrations.RunPython(modify_data),
    ]

I'm using Django 1.7 and Postgres 9.3. This used to work with South and older versions of Django.

asked Jul 06 '15 13:07 by Pankrat


1 Answer

The best workaround I found is manually exiting the atomic scope before running the data migration:

def modify_data(apps, schema_editor):
    schema_editor.atomic.__exit__(None, None, None)
    # [...]

In contrast to resetting connection.in_atomic_block manually, this allows using the atomic context manager inside the migration. There doesn't seem to be a much saner way.
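As a rough sketch of what that makes possible, the batched update from the question could wrap each batch in its own atomic block once the outer scope has been exited, so that every batch commits when its block ends. The chunked helper is a hypothetical name introduced here for illustration; the batch size of 1000 and the "app" / "BigData" names are taken from the question. The Django import sits inside the function only so the helper can be tried on its own; in a real migration it would normally go at the top of the file.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield lists of at most `size` items from `iterable` (hypothetical helper)."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def modify_data(apps, schema_editor):
    # Imported here only to keep `chunked` usable without Django installed.
    from django.db import transaction

    # Leave the transaction opened by the schema editor (the workaround above).
    if schema_editor.connection.in_atomic_block:
        schema_editor.atomic.__exit__(None, None, None)

    BigData = apps.get_model("app", "BigData")
    alias = schema_editor.connection.alias
    for batch in chunked(BigData.objects.iterator(), 1000):
        # Each batch runs in its own transaction and commits on block exit.
        with transaction.atomic(using=alias):
            for row in batch:
                # Modify row [...]
                row.save()
```

Using one atomic block per batch instead of explicit transaction.commit() calls means a failing batch is rolled back as a unit while earlier batches stay committed.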

The (admittedly messy) transaction break-out logic can be contained in a decorator for use with the RunPython operation:

from functools import wraps

def non_atomic_migration(func):
    """
    Close a transaction from within code that is marked atomic. This is
    required to break out of a transaction scope that is automatically wrapped
    around each migration by the schema editor. This should only be used when
    committing manually inside a data migration. Note that it doesn't re-enter
    the atomic block afterwards.
    """
    @wraps(func)
    def wrapper(apps, schema_editor):
        # Only exit if the schema editor actually opened a transaction.
        if schema_editor.connection.in_atomic_block:
            schema_editor.atomic.__exit__(None, None, None)
        return func(apps, schema_editor)
    return wrapper

Update

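Django 1.10 supports non-atomic migrations: setting atomic = False on the Migration class stops the migration from being wrapped in a transaction, so the workaround above is only needed on older versions.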

answered Oct 21 '22 12:10 by Pankrat