
Rails, how to migrate large amount of data?

I have a Rails 3 app running an older version of Spree (an open source shopping cart), and I am in the process of updating it to the latest version. This requires running numerous migrations on the database to make it compatible with the latest version. However, the app's current database is roughly 300 MB, and running the migrations on my local machine (Mac OS X 10.7, 4 GB RAM, 2.4 GHz Core 2 Duo) takes over three days to complete.

I was able to decrease this time to 16 hours using an Amazon EC2 instance (High-I/O On-Demand Instances, Quadruple Extra Large). But 16 hours is still too long, as I will have to take the site down to perform this update.

Does anyone have any other suggestions to lower this time? Or any tips to increase the performance of the migrations?

FYI: using Ruby 1.9.2, and Ubuntu on the Amazon instance.

akaDanPaul asked Jul 20 '12 23:07




1 Answer

  • Dropping indices beforehand and adding them again afterwards is a good idea.

  • Replacing .where(...).each with .find_each, and perhaps wrapping the updates in transactions, could also help, as already mentioned.

  • Replace .save! with .save(:validate => false). During a migration you are not getting random input from users; you should be making known-good updates, and validations account for much of the execution time. Where you're only updating one field, .update_attribute also skips validations.

  • Where possible, use fewer AR objects in a loop. Instantiating and later garbage collecting them takes CPU time and uses more memory.
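
To illustrate why .find_each helps, here is a minimal plain-Ruby sketch of the batching idea behind it. It does not use ActiveRecord: ROWS and each_in_batches are hypothetical stand-ins for a table and for the query pattern (WHERE id > last_id ORDER BY id LIMIT batch_size), so treat it as an illustration of the technique rather than what Rails does internally line for line.

```ruby
# Plain-Ruby sketch of the batching idea behind find_each.
# ROWS is a hypothetical stand-in for a database table; no ActiveRecord here.
ROWS = (1..2_500).map { |id| { :id => id, :total => id * 10 } }

# Yield rows in primary-key order, holding at most batch_size rows at a time.
# Mirrors the query pattern: WHERE id > last_id ORDER BY id LIMIT batch_size
# Returns the number of non-empty batches that were loaded.
def each_in_batches(rows, batch_size = 1000)
  last_id = 0
  batches = 0
  loop do
    batch = rows.select { |r| r[:id] > last_id }
                .sort_by { |r| r[:id] }
                .first(batch_size)
    break if batch.empty?
    batches += 1
    batch.each { |row| yield row }
    last_id = batch.last[:id]
  end
  batches
end

seen = []
batch_count = each_in_batches(ROWS, 1000) { |row| seen << row[:id] }
puts "visited #{seen.size} rows in #{batch_count} batches"
```

Only one batch is held in memory at a time, so memory use stays flat however large the table is. In an actual migration the equivalent would be something like Order.find_each(:batch_size => 1000) { |order| ... }, where Order is a hypothetical model name, combined with .save(:validate => false) inside the block.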

sockmonk answered Oct 25 '22 20:10