
Paperclip uploads for office files (docx,pptx) are being downloaded as zip files?

I'm using the following for file uploading: Rails 3.2, Paperclip (3.0.4), aws-sdk (1.5.2) & jQuery-File-Upload

The problem is that Office files such as .pptx are being downloaded as zip files, not pptx files. Here is what I see in the logs:

Started POST
Processing by AttachmentsController#create as JS
  Parameters: {"files"=>[#<ActionDispatch::Http::UploadedFile:0x007fa1d5bee960 @original_filename="test1.pptx", @content_type="application/vnd.openxmlformats-officedocument.presentationml.presentation", @headers="Content-Disposition: form-data; name=\"files[]\"; filename=\"test1.pptx\"\r\nContent-Type: application/vnd.openxmlformats-officedocument.presentationml.presentation\r\n", @tempfile=#<File:/var/folders/rm/89l_3yt93g31p22738hqydmr0000gn/T/RackMultipart20120529-10443-1ljhigq>>]}
.....


SQL (1.4ms)  INSERT INTO "attachments" ("attachment_content_type", "attachment_file_name", "attachment_file_size", "attachment_file_title", "attachment_updated_at", "created_at", "deleted", "room_id", "pinned", "updated_at", "user_id") VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11) RETURNING "id"  [["attachment_content_type", "application/zip"], ["attachment_file_name", "test1_1338339249.pptx"], ["attachment_file_size", 150329], ["attachment_file_title", "test1.pptx"], ["attachment_updated_at", Wed, 30 May 2012 00:54:09 UTC +00:00], ["created_at", Wed, 30 May 2012 00:54:09 UTC +00:00], ["deleted", false], ["room_id", 20], ["pinned", false], ["updated_at", Wed, 30 May 2012 00:54:09 UTC +00:00], ["user_id", 1]]
[paperclip] Saving attachments.
[paperclip] saving /development/private/rooms/20/user_uploaded_files/test1_1338339249.pptx
Command :: file -b --mime '/var/folders/rm/89l_3yt93g31p22738hqydmr0000gn/T/RackMultipart20120529-10443-1ljhigq20120529-10443-1lr2yg2'
[AWS S3 200 1.16513 0 retries] put_object(:acl=>:private,:bucket_name=>"cdn-assets-site-com",:content_type=>"application/zip",:data=>#<Paperclip::FileAdapter:0x007fa1d2540170 @target=#<File:/var/folders/rm/89l_3yt93g31p22738hqydmr0000gn/T/RackMultipart20120529-10443-1ljhigq>, @tempfile=#<File:/var/folders/rm/89l_3yt93g31p22738hqydmr0000gn/T/RackMultipart20120529-10443-1ljhigq20120529-10443-1lr2yg2>>,:key=>"development/private/rooms/20/user_uploaded_files/test1_1338339249.pptx") 

Notice how the file comes in as pptx, but is uploaded to AWS S3 with the content type application/zip?

AnApprentice asked May 30 '12 01:05



3 Answers

It seems you don't have the Office MIME types registered.

Office files whose extension ends in x (Office 2007+) are indeed zipped XML files. Anything that relies only on the standard MIME types will treat them as zip files.

MIME types for Office 2007+ files:

+-------+---------------------------------------------------------------------------+
| File  | MIME type                                                                 |
+-------+---------------------------------------------------------------------------+
| .docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document   |
+-------+---------------------------------------------------------------------------+
| .xlsx | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet         |
+-------+---------------------------------------------------------------------------+
| .pptx | application/vnd.openxmlformats-officedocument.presentationml.presentation |
+-------+---------------------------------------------------------------------------+
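
As a rough illustration of the table above, the mapping can be written as a plain Ruby hash and used to prefer the extension-based type over a content-detected application/zip. This is only a sketch: OFFICE_MIME_TYPES and office_content_type are hypothetical names, not part of Paperclip or Rails.

```ruby
# Extension-to-MIME lookup built from the table above.
OFFICE_MIME_TYPES = {
  ".docx" => "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  ".xlsx" => "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
  ".pptx" => "application/vnd.openxmlformats-officedocument.presentationml.presentation"
}.freeze

# Hypothetical helper: use the extension-based type for known Office files,
# otherwise keep whatever type was detected from the file contents.
def office_content_type(filename, detected_type)
  OFFICE_MIME_TYPES.fetch(File.extname(filename).downcase, detected_type)
end

puts office_content_type("test1_1338339249.pptx", "application/zip")
puts office_content_type("archive.zip", "application/zip")
```

The first call prints the presentationml type even though the detected type was application/zip; the second leaves a genuine zip file alone.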

In your config/initializers/mime_types.rb file, register each type you need, as in the example below:

Mime::Type.register "application/vnd.openxmlformats-officedocument.presentationml.presentation", :pptx

Ironically, IE can have difficulty recognising the new MS Office files, while other browsers recognise them fine.

To get IE working with these files, you need to add the MIME types to the server config. In Rails this is done in config/initializers/mime_types.rb:

Mime::Type.register "application/vnd.openxmlformats-officedocument.wordprocessingml.document", :docx
Mime::Type.register "application/vnd.openxmlformats-officedocument.presentationml.presentation", :pptx
Mime::Type.register "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", :xlsx

If your app is proxied through Apache and Apache serves your static assets, you'll also have to configure Apache with the new MIME types (and restart it), as per http://bignosebird.com/apache/a1.shtml

The MIME types file is usually located at /etc/mime.types, but try running locate mime.types if you're not sure.

You may also refer to the Paperclip adapters.

Further reading: Description of the default settings for the MimeMap property and for the ScriptMaps property in IIS, Office 2007 MIME types for Apache, Uploading docx files with Paperclip and Rails, and Dynamic Word (.docx) Documents in Rails.

Alfred answered Oct 26 '22 01:10


It turns out, as Marc B first hinted, that all Office documents whose extension ends in x are indeed zipped XML files. Anything that uses normal mimetypes will assume that it's a zipped file.

To get around this, you have to register the Office mimetypes with your server. So, for your .pptx files, you put

Mime::Type.register "application/vnd.openxmlformats-officedocument.presentationml.presentation", :pptx

in your config/initializers/mime_types.rb file.

Alternatively, if you have to support all of the Office 2007 file types, you can use the Rack::Mime::MIME_TYPES.merge!() method, which is seen in action in this Stack Overflow answer.
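
A minimal sketch of that alternative, assuming Rack is loaded (it is in a Rails app) and the code lives in an initializer such as config/initializers/mime_types.rb. Rack::Mime::MIME_TYPES is a hash keyed by file extension with a leading dot. Whether this changes the content type Paperclip stores is an assumption to verify against your Paperclip version.

```ruby
# Sketch of an initializer, e.g. config/initializers/mime_types.rb
require "rack/mime"

# Add (or overwrite) the Office 2007+ entries in Rack's extension table.
Rack::Mime::MIME_TYPES.merge!(
  ".docx" => "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  ".xlsx" => "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
  ".pptx" => "application/vnd.openxmlformats-officedocument.presentationml.presentation"
)
```

Recent Rack releases may already ship some of these entries, in which case the merge simply leaves them with the same values.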

Makoto answered Oct 26 '22 01:10


The 'x' versions of the Office formats ARE zip files - zipped xml. As such, anything that determines file extensions based on mime types will always see them as zip files.
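
To see why content-based detection reports zip, here is a minimal sketch that checks the leading bytes of a file against the ZIP local file header signature (the bytes "PK" followed by 0x03 0x04). The helper name zip_signature? is hypothetical, and real detectors such as the file command do considerably more than this.

```ruby
# Signature that starts a non-empty ZIP archive: "PK\x03\x04".
ZIP_SIGNATURE = "PK\x03\x04".b.freeze

# Returns true when the given bytes begin with the ZIP signature.
def zip_signature?(bytes)
  bytes.b.start_with?(ZIP_SIGNATURE)
end

# Hypothetical usage with a file on disk:
#   zip_signature?(File.binread("test1.pptx", 4))
puts zip_signature?("PK\x03\x04rest of a pptx or zip file")
puts zip_signature?("%PDF-1.4")
```

A .pptx, .docx, or .xlsx file passes this check just like an ordinary zip archive, which is why a detector that looks only at the contents cannot tell them apart without further inspection.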

Marc B answered Oct 26 '22 02:10