 

HDFS vs GridFS: When to use which?

HDFS and GridFS are two great technologies for distributed file storage, but what are their differences? What kinds of problems is each one better suited to?

iCode asked Jan 31 '12 09:01



1 Answer

HDFS is intended for batch processing (you know, when you run a query that reads many of your files one by one), but it really sucks at random-access operations, and it is a pain in the neck to deploy and maintain (you know, all those ZooKeepers, NameNodes and so on). GridFS, on the other hand, is slower for batch workloads but holds up well when you do a lot of random access. It does have a bigger storage overhead than HDFS, though.

I would say that you should use HDFS for analytics and GridFS for backing a website.
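
To illustrate the random-access point, here is a minimal sketch of how a byte range maps onto GridFS chunks. It assumes the documented layout where a file is split into fixed-size chunk documents numbered by an `n` field, and it assumes a 255 KiB chunk size (the default in current drivers; yours may differ). The function name `chunks_for_range` is made up for this example and is not part of any driver API.

```python
# Assumed default chunk size (255 KiB); check your driver/collection settings.
DEFAULT_CHUNK_SIZE = 255 * 1024


def chunks_for_range(offset, length, chunk_size=DEFAULT_CHUNK_SIZE):
    """Return the chunk numbers (the `n` field) covering bytes
    [offset, offset + length) of a file stored in fixed-size chunks."""
    if offset < 0 or length <= 0:
        return range(0)  # nothing to read
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return range(first, last + 1)


# Reading 4 KiB at the 10 MiB mark touches only the chunk(s) listed here,
# not the whole file.
needed = list(chunks_for_range(10 * 1024 * 1024, 4096))
```

A small read in the middle of a large file only needs the few chunk documents returned here, which is roughly why random access works out fine for serving website assets.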

om-nom-nom answered Sep 21 '22 15:09
