Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does OpenURI treat files under 10kb in size as StringIO?

Tags:

I fetch images with open-uri from a remote website and persist them on my local server within my Ruby on Rails application. Most of the images were shown without a problem, but some images just didn't show up.

After a very long debugging-session I finally found out (thanks to this blogpost) that the reason for this is that the class Buffer in the open-uri-libary treats files with less than 10kb in size as IO-objects instead of tempfiles.

I managed to get around this problem by following the answer from Micah Winkelspecht to this StackOverflow question, where I put the following code within a file in my initializers:

require 'open-uri' # Don't allow downloaded files to be created as StringIO. Force a tempfile to be created. OpenURI::Buffer.send :remove_const, 'StringMax' if OpenURI::Buffer.const_defined?('StringMax') OpenURI::Buffer.const_set 'StringMax', 0 

This works as expected so far, but I keep wondering, why they put this code into the library in the first place? Does anybody know a specific reason, why files under 10kb in size get treated as StringIO ?

Since the above code practically resets this behaviour globally for my entire application, I just want to make sure that I am not breaking anything else.

like image 289
klaffenboeck Avatar asked May 08 '12 10:05

klaffenboeck


1 Answers

When one does network programming, you allocate a buffer of a reasonably large size and send and read units of data which will fit in the buffer. However, when dealing with files (or sometimes things called BLOBs) you cannot assume that the data will fit into your buffer. So, you need special handling for these large streams of data.

(Sometimes the units of data which fit into the buffer are called packets. However, packets are really a layer 4 thing, like frames are at layer 2. Since this is happening a layer 7, they might better be called messages.)

For replies larger than 10K, the open-uri library is setting up the extra overhead to write to a stream objects. When under the StringMax size, it just includes the string in the message, since it knows it can fit in the buffer.

like image 73
Marlin Pierce Avatar answered Sep 21 '22 12:09

Marlin Pierce