Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is it taking so long to retrieve a file URL from S3 using CarrierWave and Fog?

I'm using CarrierWave (0.9.0), Fog (1.14.0) and S3 to store user avatars. It seems to be taking a long time to determine the avatar URL for a given user. Subsequent calls have a greatly reduced time.

config/initializers/fog.rb

CarrierWave.configure do |config|
  config.fog_credentials = {
    provider: 'AWS',
    aws_access_key_id: ENV['AWS_ACCESS_KEY_ID'],
    aws_secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
  }
  config.fog_directory = ENV['AWS_S3_BUCKET_AVATAR']
  config.fog_public = false
end

user.rb

mount_uploader :avatar, ImageUploader

Console

irb(main):002:0> Benchmark.measure { user.avatar_url }
=>   0.500000   0.020000   0.520000 (  0.537884)

irb(main):003:0> Benchmark.measure { user.avatar_url }
=>   0.000000   0.000000   0.000000 (  0.001830)

irb(main):004:0> Benchmark.measure { user.avatar_url }
=>   0.000000   0.000000   0.000000 (  0.001454)

irb(main):005:0> Benchmark.measure { user.avatar_url }
=>   0.000000   0.000000   0.000000 (  0.000998)

New Relic reports that sometimes user.avatar_url takes up to 1 second. What could be causing this to be so slow? There's a discussion on slow URL generation for public files, but my avatars are not public.


Update 1:

Explicitly requiring Fog and CarrierWave before the first call makes no difference, as false is returned, indicating that they're already loaded.

irb(main):002:0> require 'carrierwave'
=> false
irb(main):003:0> require 'fog'
=> false
irb(main):004:0> Benchmark.measure { user.avatar_url }
=>   0.510000   0.030000   0.540000 (  1.627774)

Update 2:

Here's the uploader:

# encoding: utf-8

class ImageUploader < CarrierWave::Uploader::Base

  include CarrierWave::RMagick

  storage :fog

  def store_dir
    "uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{model.id}"
  end

  version :s_32 do
   process resize_to_fill: [32, 32]
  end

  version :s_40 do
   process resize_to_fill: [40, 40]
  end

  version :s_50 do
   process resize_to_fill: [50, 50]
  end

  version :s_115_120 do
   process resize_to_fill: [115, 120]
  end

  version :s_128 do
   process resize_to_fill: [128, 128]
  end

  def extension_white_list
    %w(jpg jpeg gif png)
  end

end

Update 3:

user.avatar.url doesn't seem to make a difference:

irb(main):003:0> Benchmark.measure { user.avatar.url }
=>   0.500000   0.030000   0.530000 (  0.926975)
like image 439
ben Avatar asked Jul 30 '13 15:07

ben


2 Answers

I think the fog require may still be at issue (though it is less obvious now). Since fog has so very many different things within it, we made a choice long ago to defer loading many of the dependencies until they were needed. This has the benefit of speeding up 'fog', but can have the detriment of slowing down the first time certain things happen. Not sure how I forgot about this part, but in doing some small benchmarking on my machine I can certainly see a slow down when taking this into consideration.

To get around this, you can change the related requiring benchmark above to something like:

require 'benchmark'
require 'fog'
Fog::Storage.new(
  provider: 'AWS',
  aws_access_key_id: ENV['AWS_ACCESS_KEY_ID'],
  aws_secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
)
Benchmark.measure { ... }

It may seem a bit odd in that you don't use that connection, but I set it up to defer loading S3 specifics until you do that (so that you don't, for instance, have to load S3 specifics in order to use EC2). By initializing a connection at some earlier time, however, you avoid that overhead. Hopefully that will at least get you closer to where you want to be.

like image 51
geemus Avatar answered Oct 22 '22 14:10

geemus


I spent some time on this. While I haven't figured out the root cause, here are some findings. You can start from what I've done and hopefully you don't need to worry about this too much.

  • Based on what you said and my experiments. This slowness only happens when you call user.avatar_url the first. It's fast if you call user.avatar_url or another_user.avatar_url after the first call. Basically only one request is affected by this 'issue', so I think this is not a big problem.
  • What are the difference between the first and subsequent calls? I suggest you use ruby-prof to check the difference. I am not an expert but from the report you will see the first run loads more stuff than the second run. ruby-prof provides some other reports which might be helpful to you, especially the tree report.

    • first run: https://gist.github.com/wanghq/6199526
    • second run: https://gist.github.com/wanghq/6199530
  • I thought CarrierWave/Fog might need to access S3 to make sure the url does exist. This was ruled out by using an incorrect S3 bucket.

The above report is generated by below code.

user = User.first
RubyProf.start
user.avatar_url
results = RubyProf.stop

File.open "/tmp/profile-graph1.txt", 'w' do |file|
  RubyProf::FlatPrinter.new(results).print(file)
end

RubyProf.start
user.avatar_url
results = RubyProf.stop

File.open "/tmp/profile-graph2.txt", 'w' do |file|
  RubyProf::FlatPrinter.new(results).print(file)
end
like image 43
wanghq Avatar answered Oct 22 '22 12:10

wanghq