
How can I compare two files in golang?

In Python I can do the following:

equals = filecmp.cmp(file_old, file_new)

Is there a built-in function to do that in Go? I searched for one without success.

I could use a hash function from the hash/crc32 package, but that is more work than the Python code above.

asked Apr 08 '15 02:04 by rvillablanca




2 Answers

To complete @captncraig's answer: if you want to know whether the two paths refer to the same file, you can use the SameFile(fi1, fi2 FileInfo) function from the os package.

SameFile reports whether fi1 and fi2 describe the same file. For example, on Unix this means that the device and inode fields of the two underlying structures are identical; on other systems the decision may be based on the path names.

If you want to compare the files' contents instead, here is a solution that checks the two files line by line without loading either file entirely into memory.

First try: https://play.golang.org/p/NlQZRrW1dT


EDIT: Read in chunks of bytes and fail fast if the files do not have the same size. https://play.golang.org/p/YyYWuCRJXV

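import (
    "bytes"
    "io"
    "log"
    "os"
)

const chunkSize = 64000

func deepCompare(file1, file2 string) bool {
    // Check file size ...

    f1, err := os.Open(file1)
    if err != nil {
        log.Fatal(err)
    }
    defer f1.Close()

    f2, err := os.Open(file2)
    if err != nil {
        log.Fatal(err)
    }
    defer f2.Close()

    // Allocate the buffers once and reuse them for every chunk.
    b1 := make([]byte, chunkSize)
    b2 := make([]byte, chunkSize)

    for {
        // io.ReadFull keeps reading until the buffer is full or the file ends,
        // so a short read cannot make the two chunks fall out of step.
        n1, err1 := io.ReadFull(f1, b1)
        n2, err2 := io.ReadFull(f2, b2)

        // Both EOF variants mean the file ended; anything else is a real error.
        eof1 := err1 == io.EOF || err1 == io.ErrUnexpectedEOF
        eof2 := err2 == io.EOF || err2 == io.ErrUnexpectedEOF
        if (err1 != nil && !eof1) || (err2 != nil && !eof2) {
            log.Fatal(err1, err2)
        }

        // Compare only the bytes actually read in this round.
        if !bytes.Equal(b1[:n1], b2[:n2]) {
            return false
        }
        if eof1 || eof2 {
            return eof1 && eof2
        }
    }
}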

Pith answered Oct 16 '22 19:10


I am not sure that function does what you think it does. From the docs,

Unless shallow is given and is false, files with identical os.stat() signatures are taken to be equal.

Your call is comparing only the signature of os.stat, which only includes:

  1. File type (taken from the file mode)
  2. Modified Time
  3. Size

You can get all three of these in Go from the os.Stat function. A match only suggests that the two paths are literally the same file, symlinks to the same file, or a copy of that file; it does not compare the contents.

If you want to go deeper, you can open both files and compare their contents (the Python version reads 8k at a time).

You could use a CRC or MD5 to hash both files, but hashing reads each file to the end, and if there are differences at the beginning of a long file you want to stop early. I would recommend reading some number of bytes at a time from each reader and comparing with bytes.Compare.

captncraig answered Oct 16 '22 19:10