
MIME type checking of uploaded files in Go

Tags: mime-types, go

I am trying to get the MIME type of files being uploaded to my server.

The MIME type of .xlsx and .docx files comes up as application/zip. I tried unzipping the file and reading the "_rels/.rels" entry. My doubts are: when reading this entry, what maximum size should I allow for the read? And if the Target is "xl/workbook.xml", can I assume the file is an xlsx?

My code is below:

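import (
    "archive/zip"
    "fmt"
    "io"
    "net/http"
)

func uploadHandler(w http.ResponseWriter, r *http.Request) {
    file, fileHeader, err := r.FormFile("file")
    if err != nil {
        http.Error(w, err.Error(), http.StatusBadRequest)
        return
    }
    defer file.Close()

    // DetectContentType considers at most the first 512 bytes.
    buffer := make([]byte, 512)
    n, err := file.Read(buffer)
    if err != nil && err != io.EOF {
        fmt.Println(err)
    }

    contentType := http.DetectContentType(buffer[:n])
    if contentType == "application/zip" {
        // zip.NewReader reads through ReadAt, so the Read above does not disturb it.
        zr, err := zip.NewReader(file, fileHeader.Size)
        if err != nil {
            fmt.Println(err)
            return
        }
        for _, zf := range zr.File {
            if zf.Name != "_rels/.rels" {
                continue
            }
            fmt.Println("rels")
            rc, err := zf.Open()
            if err != nil {
                fmt.Println("Rels errors")
                return
            }
            const BufferSize = 1000
            buffer := make([]byte, BufferSize)
            // ReadFull keeps reading until the buffer is full or the entry ends.
            bytesread, err := io.ReadFull(rc, buffer)
            rc.Close() // close here rather than defer inside the loop
            if err != nil && err != io.EOF && err != io.ErrUnexpectedEOF {
                fmt.Println(err)
            }

            fmt.Println("bytes read: ", bytesread)
            fmt.Println("bytestream to string: ", string(buffer[:bytesread]))
        }
    }

    var arr []byte
    w.Header().Set("Content-Type", "application/json")
    w.Write(arr)
}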

The output I get is:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships     xmlns="http://schemas.openxmlformats.org/package/2006/relationships"><Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/extended-properties" Target="docProps/app.xml"/><Relationship Id="rId2" Type="http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties" Target="docProps/core.xml"/><Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="xl/workbook.xml"/></Relationships>

Any tips on how to read a .doc or .xls file?

Asked by Rahul Ganguly on Jul 06 '18 at 11:07



1 Answer

Unfortunately, DetectContentType from the net/http package is rather limited in the MIME types it can detect.

As for detecting binary formats, you don't need to read the whole file if all you need is to tell whether it is a .doc. You can just check the file signature at the start of the file. A good resource for this is a table of known file signatures.

If you would rather use an existing package, here is a summary of what's on GitHub.

Disclaimer: I'm the author of mimetype.

  • filetype

    • pure Go, no C bindings
    • can be extended to detect new MIME types
    • has issues with files that match more than one MIME type (e.g. xlsx and docx passing as zip): it stores its matching functions in a map, so the order of traversal is not guaranteed
  • magicmime

    • needs libmagic-dev installed
    • can be extended, albeit with more effort (see man magic)
  • mimetype

    • pure Go, no C bindings
    • detects more MIME types than filetype
    • is thread safe
    • can be extended
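For the xlsx/docx-versus-zip case mentioned in the list, the usual way around the ambiguity is to test the more specific formats before falling back to plain zip. The following is a stdlib-only sketch of that idea, not how any of the packages above implement it. It assumes the conventional part names ("xl/workbook.xml", "word/document.xml"); strictly speaking, the main part is whatever the officeDocument relationship in "_rels/.rels" points to, so treat this as a heuristic. classifyOOXML and buildZip are hypothetical helpers.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
)

const (
	mimeZip  = "application/zip"
	mimeXLSX = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
	mimeDOCX = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
)

// classifyOOXML looks only at entry names in the archive's central
// directory, so no entry is decompressed. Hypothetical helper.
func classifyOOXML(data []byte) (string, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return "", err
	}
	names := make(map[string]bool, len(zr.File))
	for _, f := range zr.File {
		names[f.Name] = true
	}
	// Specific formats first, generic zip last.
	switch {
	case !names["[Content_Types].xml"]:
		return mimeZip, nil
	case names["xl/workbook.xml"]:
		return mimeXLSX, nil
	case names["word/document.xml"]:
		return mimeDOCX, nil
	}
	return mimeZip, nil
}

// buildZip creates an in-memory archive with empty entries, for the demo.
func buildZip(names ...string) []byte {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, n := range names {
		if _, err := zw.Create(n); err != nil {
			panic(err)
		}
	}
	if err := zw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	samples := [][]byte{
		buildZip("[Content_Types].xml", "_rels/.rels", "xl/workbook.xml"),
		buildZip("[Content_Types].xml", "_rels/.rels", "word/document.xml"),
		buildZip("notes.txt"),
	}
	for _, s := range samples {
		fmt.Println(classifyOOXML(s))
	}
}
```

Because only the central directory is consulted, this also avoids having to pick a read size for any individual entry.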
Answered by GabrielVasile on Oct 10 '22 at 21:10