
Is GridFS faster than a regular filesystem?

Tags: mongodb, gridfs

I wonder whether storing all uploaded files in GridFS is faster than storing them on a regular filesystem such as ext4, in terms of read/write speed and average server load.

eigenein asked Dec 31 '10 11:12


People also ask

Is GridFS fast?

Yes, GridFS is fast and reliable enough for production use.

What is GridFS and usage?

GridFS is the MongoDB specification for storing and retrieving large files such as images, audio files, and video files. It behaves like a file system, but its data is stored in MongoDB collections. GridFS can store files larger than the 16 MB BSON document size limit.

What is the default size of GridFS chunk?

By default, GridFS uses a chunk size of 255 kB; that is, it divides a file into chunks of 255 kB each, except for the last chunk, which is only as large as necessary.
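
The chunking arithmetic described above can be sketched in a few lines. This is only an illustration: `chunk_layout` is a hypothetical helper, not part of any MongoDB driver, and it assumes that 255 kB means 255 * 1024 = 261,120 bytes.

```python
# Assumption: "255 kB" is read as 255 * 1024 bytes.
DEFAULT_CHUNK_SIZE = 255 * 1024  # 261,120 bytes


def chunk_layout(file_length, chunk_size=DEFAULT_CHUNK_SIZE):
    """Return (number_of_chunks, size_of_last_chunk) for a file of file_length bytes.

    Hypothetical helper for illustration only; real drivers do this internally.
    """
    if file_length < 0 or chunk_size <= 0:
        raise ValueError("file_length must be >= 0 and chunk_size must be > 0")
    if file_length == 0:
        return 0, 0
    # Ceiling division without floating point.
    n_chunks = -(-file_length // chunk_size)
    # Every chunk except the last one is full.
    last_chunk_size = file_length - (n_chunks - 1) * chunk_size
    return n_chunks, last_chunk_size


if __name__ == "__main__":
    # A 1,000,000-byte file needs three full chunks plus one partial chunk.
    print(chunk_layout(1_000_000))
```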

What is MD5 in GridFS?

A kind of safe mode is built into the GridFS specification. When you save a file, an MD5 hash is created on the server. If you save the file in safe mode, an MD5 hash is also created on the client and compared with the server version. If the two hashes don't match, an exception is raised.
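
The comparison described above can be sketched with Python's standard `hashlib`. This is not driver code: `md5_of_chunks` and `verify_upload` are hypothetical names, and the server-side digest is assumed to arrive as a hex string.

```python
import hashlib


def md5_of_chunks(chunks):
    """Compute the MD5 hex digest of a file given as an iterable of byte chunks."""
    digest = hashlib.md5()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def verify_upload(client_chunks, server_md5):
    """Raise ValueError if the client-side digest differs from the server's.

    Hypothetical helper: server_md5 is assumed to be the hex digest reported
    by the server after the file was saved.
    """
    client_md5 = md5_of_chunks(client_chunks)
    if client_md5 != server_md5:
        raise ValueError(
            f"MD5 mismatch: client {client_md5}, server {server_md5}"
        )
    return client_md5
```

Hashing chunk by chunk gives the same digest as hashing the whole file at once, so the client never has to hold the complete file in memory for the check.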


2 Answers

In general, GridFS is slower than a regular filesystem for ordinary file access. However, it benefits from several MongoDB features:

  • Metadata. You can associate any metadata with the files and query it in the usual manner. Files are actually stored as regular MongoDB documents in the fs.files and fs.chunks collections.
  • Replication. With a replica set you get an (almost) instant backup, failover, and read scalability (read requests can go to secondary (slave) nodes).
  • Sharding. As with any other collection, files can be distributed across multiple MongoDB instances with auto-sharding, which improves write scalability.
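
To make the first point concrete, here is a rough in-memory model of the documents that end up in fs.files and fs.chunks. It uses plain Python dicts and talks to no database; `to_gridfs_documents` and `find_by_metadata` are hypothetical helpers, and the real chunk documents also carry their own `_id`, which is omitted here.

```python
from datetime import datetime, timezone


def to_gridfs_documents(file_id, filename, data, metadata, chunk_size=255 * 1024):
    """Build one fs.files-style document and a list of fs.chunks-style documents.

    Illustrative model only; a real driver performs this split on upload.
    """
    files_doc = {
        "_id": file_id,
        "length": len(data),
        "chunkSize": chunk_size,
        "uploadDate": datetime.now(timezone.utc),
        "filename": filename,
        "metadata": metadata,  # arbitrary user-defined fields
    }
    chunk_docs = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunk_docs


def find_by_metadata(files_docs, **criteria):
    """Return the file documents whose metadata matches every key/value given."""
    return [
        doc for doc in files_docs
        if all(doc["metadata"].get(key) == value for key, value in criteria.items())
    ]
```

Because the file entry is an ordinary document, a lookup such as `find_by_metadata(docs, owner="alice")` corresponds to a normal MongoDB query on the `metadata` subdocument.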
pingw33n answered Oct 02 '22 15:10


When to use GridFS

  • If your filesystem limits the number of files in a directory, you can use GridFS to store as many files as needed.
  • When you want to keep your files and metadata automatically synced and deployed across a number of systems and facilities. With geographically distributed replica sets, MongoDB can distribute files and their metadata automatically to a number of mongod instances and facilities.
  • When you want to access information from portions of large files without loading the whole files into memory. GridFS can recall sections of a file without reading the entire file into memory.
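
The last point follows from the chunked layout: a byte range maps to a small set of chunk numbers, so only those chunks need to be fetched. The sketch below shows the arithmetic only; `chunks_for_range` and `read_range` are hypothetical helpers, and `chunks_by_n` stands in for a lookup of chunk data by its number `n`.

```python
def chunks_for_range(start, length, chunk_size=255 * 1024):
    """Return the range of chunk numbers covering bytes [start, start + length)."""
    if start < 0 or length < 0 or chunk_size <= 0:
        raise ValueError("start and length must be >= 0 and chunk_size must be > 0")
    if length == 0:
        return range(0)
    first = start // chunk_size
    last = (start + length - 1) // chunk_size
    return range(first, last + 1)


def read_range(chunks_by_n, start, length, chunk_size):
    """Read `length` bytes at offset `start`, touching only the chunks needed.

    chunks_by_n is assumed to map a chunk number to that chunk's bytes.
    """
    needed = chunks_for_range(start, length, chunk_size)
    joined = b"".join(chunks_by_n[n] for n in needed)
    # Offset of the requested range inside the first fetched chunk.
    offset = start - needed.start * chunk_size
    return joined[offset:offset + length]
```

With the default chunk size, reading 100 bytes at offset 1,000,000 touches a single chunk (number 3), regardless of how large the file is.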
firefly2442 answered Oct 02 '22 17:10