
How to sync new ActiveStorage mirrors?

With ActiveStorage you can now define mirrors for storing your files.

local:
  service: Disk
  root: <%= Rails.root.join("storage") %>

amazon:
  service: S3
  access_key_id: <%= Rails.application.credentials.dig(:aws, :access_key_id) %>
  secret_access_key: <%= Rails.application.credentials.dig(:aws, :secret_access_key) %>
  region: us-east-1
  bucket: mybucket

mirror:
  service: Mirror
  primary: local
  mirrors:
    - amazon
    - another_mirror

If you add a mirror later on, you have to take care of copying all existing files yourself, e.g. from "local" to "amazon" or "another_mirror".

  1. Is there a convenient method to keep the files in sync?
  2. Or is there a method to run a validation that checks whether all files are available on each service?
Chris asked Oct 07 '18 00:10


2 Answers

I have a couple of solutions that might work for you, one for Rails <= 6.0 and one for Rails >= 6.1:

Firstly, you need to iterate through your ActiveStorage blobs:

ActiveStorage::Blob.find_each do |blob|
  # work with blob
end

then...

  1. Rails <= 6.0

    You will need the blob's key, checksum, and the local file on disk.

    local_file = ActiveStorage::Blob.service.primary.path_for blob.key
    
    # I'm picking the first mirror as an example,
    # but you can select a specific mirror if you want
    mirror = blob.service.mirrors.first
    
    mirror.upload blob.key, File.open(local_file), checksum: blob.checksum
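
A note on the checksum: argument used above. As far as I know, ActiveStorage stores blob.checksum as a Base64-encoded MD5 digest of the file contents, and services verify uploads against it. Here is a minimal sketch of computing such a digest yourself, e.g. to double-check a local file before mirroring it. base64_md5 is a hypothetical helper name, not an ActiveStorage method, and the StringIO stands in for a real file:

```ruby
require "digest"
require "base64"
require "stringio"

# Hypothetical helper (not part of ActiveStorage): computes a Base64-encoded
# MD5 digest of an IO by reading it in chunks, then rewinds the IO so it can
# be uploaded afterwards.
def base64_md5(io, chunk_size: 5 * 1024 * 1024)
  digest = Digest::MD5.new
  while (chunk = io.read(chunk_size))
    digest << chunk
  end
  io.rewind
  digest.base64digest
end

# A StringIO stands in for File.open(local_file) here.
io = StringIO.new("hello mirror")
checksum = base64_md5(io)
```

If the result differs from blob.checksum, the local copy is probably corrupt and not worth mirroring.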
    

    You may also want to avoid uploading a file if it already exists on the mirror. You can do that like this:

    mirror = blob.service.mirrors.first
    
    # If the file doesn't exist on the mirror, upload it
    unless mirror.exist? blob.key
      # Upload file to mirror
    end
    

    Putting it together, a rake task might look like:

    # lib/tasks/active_storage.rake
    
    namespace :active_storage do
    
      desc 'Ensures all files are mirrored'
      task mirror_all: [:environment] do
    
        # Iterate through each blob (find_each loads them in batches)
        ActiveStorage::Blob.find_each do |blob|
    
          # We assume the primary storage is local
          local_file = ActiveStorage::Blob.service.primary.path_for blob.key
    
          # Iterate through each mirror
          blob.service.mirrors.each do |mirror|
    
            # Skip the upload if the file already exists on the mirror
            next if mirror.exist? blob.key
    
            # The block form of File.open closes the handle after the upload
            File.open(local_file) do |file|
              mirror.upload blob.key, file, checksum: blob.checksum
            end
          end
        end
      end
    end
    

    You may run into a situation like @Rystraum mentioned where you might need to mirror from somewhere other than the local disk. In this case, the rake task could look like this:

    # lib/tasks/active_storage.rake
    
    namespace :active_storage do
    
      desc 'Ensures all files are mirrored'
      task mirror_all: [:environment] do
    
        # All services in our rails configuration
        all_services = [ActiveStorage::Blob.service.primary, *ActiveStorage::Blob.service.mirrors]
    
        # Iterate through each blob
        ActiveStorage::Blob.all.each do |blob|
    
          # Select services where file exists
          services = all_services.select { |service| service.exist? blob.key }
    
          # Skip blob if file doesn't exist anywhere
          next unless services.present?
    
          # Select services where file doesn't exist
          mirrors = all_services - services
    
          # Prefer a local DiskService copy as the source (if one exists)
          disk = services.find { |service| service.is_a? ActiveStorage::Service::DiskService }
    
          if disk
            # Upload local file to mirrors; the block form closes the file afterwards
            File.open(disk.path_for(blob.key)) do |local_file|
              mirrors.each do |mirror|
                # Rewind so each mirror reads the file from the start
                local_file.rewind
                mirror.upload blob.key, local_file, checksum: blob.checksum
              end
            end
          else
            # If no local file exists then download a remote file and upload it to the mirrors (thanks @Rystraum)
            services.first.open blob.key, checksum: blob.checksum do |temp_file|
              mirrors.each do |mirror|
                temp_file.rewind
                mirror.upload blob.key, temp_file, checksum: blob.checksum
              end
            end
          end
    
        end
      end
    end
    

    While the first rake task answers the OP's question, the latter is much more versatile:

    • It can be used with any combination of services
    • A DiskService is not required
    • Uploads from a DiskService are prioritized
    • It avoids extra exist? calls, as we only call it once per service per blob
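
The core of that second task (find the services that have the file, then copy it to the ones that don't) can be tried outside Rails. This is only a sketch: MemoryService and sync_key are hypothetical names, and the stand-in class mimics just the exist?/upload/download calls, not the real ActiveStorage service classes:

```ruby
require "stringio"
require "digest"

# Hypothetical in-memory stand-in for a storage service. It only mimics the
# exist?/upload/download calls used below; it is not an ActiveStorage class.
class MemoryService
  attr_reader :name

  def initialize(name, files = {})
    @name = name
    @files = files.dup
  end

  def exist?(key)
    @files.key?(key)
  end

  # Stores the IO's contents, optionally verifying a Base64 MD5 checksum first.
  def upload(key, io, checksum: nil)
    data = io.read
    if checksum && Digest::MD5.base64digest(data) != checksum
      raise ArgumentError, "checksum mismatch for #{key}"
    end
    @files[key] = data
  end

  def download(key)
    @files.fetch(key)
  end
end

# Copies one key from the first service that has it to every service that
# lacks it. Returns the names of the services that received a copy.
def sync_key(key, services, checksum: nil)
  present, missing = services.partition { |service| service.exist?(key) }
  return [] if present.empty? || missing.empty?

  data = present.first.download(key)
  missing.each { |service| service.upload(key, StringIO.new(data), checksum: checksum) }
  missing.map(&:name)
end

local  = MemoryService.new(:local, "abc123" => "file contents")
amazon = MemoryService.new(:amazon)
backup = MemoryService.new(:backup, "abc123" => "file contents")

first_run  = sync_key("abc123", [local, amazon, backup])
second_run = sync_key("abc123", [local, amazon, backup])
```

Here only amazon lacks the key, so the first run copies to it and the second run has nothing left to do. Running the task twice is therefore harmless.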
  2. Rails >= 6.1

    It's super easy: just call this on each blob...

    blob.mirror_later
    

    Wrapping it up as a rake task looks like:

    # lib/tasks/active_storage.rake
    
    namespace :active_storage do
    
      desc 'Ensures all files are mirrored'
      task mirror_all: [:environment] do
        ActiveStorage::Blob.find_each do |blob|
          blob.mirror_later
        end
      end
    end
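
As for the second part of the question (validating that every file is available on each service), I'm not aware of a built-in check, but since every service responds to exist?, a report is easy to build. A rough sketch, where FakeService and missing_keys_by_service are hypothetical names and the stand-in only implements exist?:

```ruby
require "set"

# Hypothetical stand-in exposing only exist?, the one call the report needs.
FakeService = Struct.new(:keys) do
  def exist?(key)
    keys.include?(key)
  end
end

# Returns { label => [keys missing on that service] }, listing only the
# services that have gaps.
def missing_keys_by_service(keys, services)
  services.each_with_object({}) do |(label, service), report|
    missing = keys.reject { |key| service.exist?(key) }
    report[label] = missing unless missing.empty?
  end
end

services = {
  local:  FakeService.new(Set["a", "b", "c"]),
  amazon: FakeService.new(Set["a", "c"])
}

report = missing_keys_by_service(%w[a b c], services)
```

In a Rails app the keys could presumably come from ActiveStorage::Blob.pluck(:key), and the services from ActiveStorage::Blob.service.primary and ActiveStorage::Blob.service.mirrors as in the rake tasks above. I haven't run it against real services. An empty report means everything is in sync.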
    
Tayden answered Sep 20 '22 14:09


Everything is stored according to ActiveStorage's keys, so as long as your bucket names and file names aren't changed in the transfer, you can just copy everything over to the new service. See this post for how to copy stuff over.

ryanhkerr answered Sep 19 '22 14:09