
Using hash functions for file storage?

A common technique for storing a lot of files/blobs in a filesystem is to use a hash function to determine the file path, e.g. hash(identifier) -> "o238455789" -> o23/8455/789 (there is often a hash-collision strategy too).
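A minimal sketch of the path derivation described above. It assumes SHA-1 as the hash and a 3/4/rest split to mirror the o23/8455/789 example; the name `hashed_path` and the `widths` parameter are hypothetical, and no collision strategy is shown.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_path(identifier: str, widths=(3, 4)) -> PurePosixPath:
    """Map an identifier to a nested relative path derived from its hash.

    Assumption: SHA-1 hex digest, split into directory levels of the given
    widths, with the remainder of the digest used as the file name.
    """
    digest = hashlib.sha1(identifier.encode("utf-8")).hexdigest()
    parts = []
    pos = 0
    for width in widths:
        parts.append(digest[pos:pos + width])
        pos += width
    parts.append(digest[pos:])  # whatever is left becomes the file name
    return PurePosixPath(*parts)


if __name__ == "__main__":
    # Prints a path of the form xxx/xxxx/<remaining 33 hex characters>
    print(hashed_path("report.pdf"))
```

The same identifier always maps to the same path, and the leading directory levels spread files so that no single directory grows too large.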

Does this technique have a name (is it a 'pattern'?) so that I can find it by searching the ACM Digital Library or a similar online database of computing literature?

Are there any books/papers that explore the problem/solution?

PS: Thanks for the helpful notes, but none address the technique given above.

Stephen asked Dec 04 '08 08:12



3 Answers

I think this is what Microsoft has done in SQL Server 2008 with FILESTREAM storage. It lets SQL Server manage BLOB data that is stored as files on the NTFS file system, and it lets you access those files directly off the disk, which gives you kick-ass performance.

Microsoft released a whitepaper on managing unstructured data that you may be interested in. There's also an MSDN article describing FILESTREAM, as well as the pros and cons of file storage and whether to BLOB or not to BLOB.

Nick Kavadias answered Nov 17 '22 17:11



United States Patent 5742807 deals with this:
http://www.freepatentsonline.com/5742807.html

Systems and methods for managing a plurality of electronically stored documents in an open document repository employ a one-way hash function to compute a hash for the stored documents as an indexing link. A document management index maps an attribute of an original document stored in the repository to the hash and the document. A hash-to-location index maps the hash to an address location of the document in a file system of the repository. The attribute points to the hash which then points to the location for linking the attribute to the location.
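A rough sketch of the two indexes the abstract describes, not the patent's actual implementation. It assumes SHA-256 as the one-way hash, plain dictionaries as the indexes, and an in-memory dictionary standing in for the file system; the class and method names are hypothetical.

```python
import hashlib


class DocumentRepository:
    """Toy repository: attribute -> hash -> location, as in the abstract."""

    def __init__(self):
        self.attribute_to_hash = {}   # document management index
        self.hash_to_location = {}    # hash-to-location index
        self.files = {}               # stand-in for the real file system

    def store(self, attribute: str, content: bytes) -> str:
        # Hash the document content and derive a storage location from it.
        digest = hashlib.sha256(content).hexdigest()
        location = f"{digest[:2]}/{digest[2:4]}/{digest[4:]}"
        self.attribute_to_hash[attribute] = digest
        self.hash_to_location[digest] = location
        self.files[location] = content
        return digest

    def locate(self, attribute: str) -> str:
        # The attribute points to the hash, which points to the location.
        return self.hash_to_location[self.attribute_to_hash[attribute]]

    def load(self, attribute: str) -> bytes:
        return self.files[self.locate(attribute)]
```

Because the location is derived from the content hash, two attributes that refer to identical content resolve to the same stored file.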

clyfe answered Nov 17 '22 18:11



@Chris Kimpton

This would be called indexing. Sharding or partitioning is more about how to split a file.

Loki answered Nov 17 '22 19:11
