
Locally calculate the Dropbox hash of files

The metadata function in the Dropbox REST API has a parameter named "hash": https://www.dropbox.com/developers/reference/api#metadata

Can I calculate this hash locally, without calling any remote REST API function?

I need to know this value to reduce upload bandwidth.

Victor Sanchez asked Oct 22 '12 09:10


2 Answers

https://www.dropbox.com/developers/reference/content-hash explains how Dropbox computes its file hashes. A Python implementation is below:

import hashlib

# The file is hashed in 4 MiB blocks
DROPBOX_HASH_CHUNK_SIZE = 4 * 1024 * 1024

def compute_dropbox_hash(filename):
    """Return the Dropbox content hash of a file as a hex string."""
    block_hashes = []
    with open(filename, 'rb') as f:
        while True:
            chunk = f.read(DROPBOX_HASH_CHUNK_SIZE)
            if not chunk:
                break
            # Raw SHA-256 digest of each block
            block_hashes.append(hashlib.sha256(chunk).digest())
    # SHA-256 of the concatenated block digests, hex-encoded
    return hashlib.sha256(b''.join(block_hashes)).hexdigest()
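
The function above takes a file path. As a rough sketch of the same scheme for any binary file-like object (the name `dropbox_hash_of_stream` and the `block_size` parameter are my own, not part of any Dropbox SDK), the outer hash can be fed incrementally instead of building the concatenated digests first. The small-block call at the end is only there to show the per-block structure; real content hashes use the 4 MiB default.

```python
import hashlib
import io

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MiB, the same block size as in the answer's code

def dropbox_hash_of_stream(stream, block_size=BLOCK_SIZE):
    """Hypothetical helper: block-wise SHA-256 of SHA-256 digests over a binary stream."""
    outer = hashlib.sha256()
    while True:
        chunk = stream.read(block_size)
        if not chunk:
            break
        # Feeding each block digest into the outer hash is equivalent to
        # hashing the concatenation of all block digests at the end.
        outer.update(hashlib.sha256(chunk).digest())
    return outer.hexdigest()

# Data smaller than one block: the result is SHA-256 of the single block digest.
data = b"hello"
single_block = hashlib.sha256(hashlib.sha256(data).digest()).hexdigest()
print(dropbox_hash_of_stream(io.BytesIO(data)) == single_block)

# Illustration only: a tiny block size forces two blocks ("abcd" and "ef").
two_blocks = hashlib.sha256(
    hashlib.sha256(b"abcd").digest() + hashlib.sha256(b"ef").digest()
).hexdigest()
print(dropbox_hash_of_stream(io.BytesIO(b"abcdef"), block_size=4) == two_blocks)
```
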
SMX answered Jan 02 '23 18:01


The "hash" parameter on the metadata call isn't actually a hash of the file, but a hash of the metadata. Its purpose is to save you from re-downloading the metadata when it hasn't changed: you supply the hash in your metadata request. It is not intended to be used as a file hash.

Unfortunately, I don't see any way via the Dropbox API to get a hash of the file itself. I think your best bet for reducing upload bandwidth is to keep track of your files' hashes locally and check whether they have changed when deciding whether to upload them. Depending on your system, you will likely also want to keep track of the "rev" (revision) value returned by the metadata request, so you can tell whether the version on Dropbox itself has changed.
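
To make the local bookkeeping concrete, here is a minimal sketch of what I mean. Everything in it is an assumption for illustration: the cache file name, the helper names, and the choice of a plain SHA-256 over the file contents. The `rev` string is whatever your metadata request returned after the upload; no Dropbox calls are made here.

```python
import hashlib
import json
import os

CACHE_FILE = "upload_cache.json"  # hypothetical location of the local cache

def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file's contents without loading it all into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def load_cache(cache_file=CACHE_FILE):
    """Return the saved {path: {"sha256": ..., "rev": ...}} mapping, or {}."""
    if not os.path.exists(cache_file):
        return {}
    with open(cache_file, "r", encoding="utf-8") as f:
        return json.load(f)

def save_cache(cache, cache_file=CACHE_FILE):
    with open(cache_file, "w", encoding="utf-8") as f:
        json.dump(cache, f, indent=2)

def has_changed(path, cache):
    """True if the file is new or its contents differ from the cached hash."""
    entry = cache.get(path)
    return entry is None or entry["sha256"] != file_sha256(path)

def mark_uploaded(path, cache, rev):
    """Record the hash and the remote rev after a successful upload."""
    cache[path] = {"sha256": file_sha256(path), "rev": rev}
```

You would call `has_changed` before each upload, skip the upload when it returns False, and call `mark_uploaded` followed by `save_cache` only after the upload succeeds, so a failed upload is retried next time.
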

Ben Zittlau answered Jan 02 '23 17:01