I've got an Ecto migration where I'd like to modify some columns but also migrate some data. For example:
defmodule MyApp.Repo.Migrations.AddStatus do
  use Ecto.Migration
  import Ecto.Query

  def change do
    alter table(:foo) do
      add(:status, :text)
    end

    foos =
      from(f in MyApp.Foo, where: ...)
      |> MyApp.Repo.all()

    Enum.each(foos, fn foo ->
      # There's then some complex logic here to work
      # out how to set the status based on other attributes of `foo`
    end)
  end
end
Now, the problem here is that by calling MyApp.Repo.all
the migration essentially uses a separate database connection to the one which the alter table...
statement used (EDIT: This assumption is false, see the accepted answer). Consequently, the status
column doesn't exist yet when the query runs, so the whole migration blows up! Note, we're using a postgres
database so DDL statements are transactional.
I could do this as two separate migrations or a mix
task to set the data, leaving the migration only for the schema change, but would prefer not to in order to ensure data consistency.
Any thoughts on how to use the same database connection for MyApp.Repo
queries in this manner?
EDIT: Note, I'm working on a small set of data and downtime is acceptable in my use case. See José's response below for some good advice when this is not the case.
You can execute the current pending changes in a migration by calling Ecto.Migration.flush/0. Any code after that call will have the status column available.
defmodule MyApp.Repo.Migrations.AddStatus do
  use Ecto.Migration
  import Ecto.Query

  def change do
    alter table(:foo) do
      add(:status, :text)
    end

    flush()

    foos =
      from(f in MyApp.Foo, where: ...)
      |> MyApp.Repo.all()

    ...
  end
end
Generally speaking, doing data migration and changing the DDL at the same time is a bad practice. If this is a live system and the migration takes long, you may generate a lot of contention for a long period of time.
If your application is still processing requests, new entries can be added while the data is processed, and those won't be processed!
There are different ways to tackle this, depending on the use case, but they usually require a step-by-step approach. For example, if you are adding a new column:
the first step is to introduce the new column in the database and make sure that the new column is populated when new entries are created. This step is only to guarantee that all future entries will be correctly populated.
then a second step is to migrate the old data
finally, you can put the new data live as you can assume all of the entries are properly populated
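The backfill step above can be sketched as its own data-only migration that updates rows in small batches. This is only a sketch, not the answer's prescribed code: the table name (`foo`), the placeholder value (`'legacy'`), and the batch size of 1000 are assumptions, and the raw UPDATE stands in for whatever real status logic applies.

```elixir
defmodule MyApp.Repo.Migrations.BackfillFooStatus do
  use Ecto.Migration

  # Run outside the automatic migration transaction so each batch
  # commits on its own and locks are held only briefly.
  @disable_ddl_transaction true
  @disable_migration_lock true

  # Hypothetical backfill: pick up to 1000 unpopulated rows per pass.
  @batch_sql """
  UPDATE foo SET status = 'legacy'
  WHERE id IN (SELECT id FROM foo WHERE status IS NULL LIMIT 1000)
  """

  def up do
    backfill()
  end

  def down, do: :ok

  defp backfill do
    # repo/0 is provided by Ecto.Migration; query!/1 comes from the
    # SQL adapter and returns the number of rows touched.
    %{num_rows: n} = repo().query!(@batch_sql)
    if n > 0, do: backfill()
  end
end
```

Because the DDL transaction is disabled, each batch commits independently: a failure partway through leaves the already-processed rows populated rather than rolling everything back, and the migration can simply be re-run.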