
Store large amount of images on multiple servers

I would like to know the best solution for storing a large number of images on multiple servers, the way Google and Facebook do.

It seems that storing them in the filesystem is better than storing them inside a database, but what about using a NoSQL DB like Cassandra?

Do Google and Facebook store the same image on multiple servers for load balancing? How does it work? What is the best solution?

Thx a lot

Naster asked Mar 25 '12 02:03




1 Answer

There's nothing wrong with the approach you're taking. As mentioned, there are caveats; however, the possibilities do exist, and a lot of people and companies are successfully storing files in Apache Cassandra.

  • zjffdu/cassandra-fs is the first solution I'd look into. It was last developed 2 years ago, though, so I'd be cautious about it working out of the box the first time. Apache Cassandra is now at version 1.0.x, with 1.1.x on the way; 2 years ago it was at version 0.6.x, maybe? A lot has changed and improved in 24 months.
  • semantico/cassandra-fs a fork ... last touched 7 months ago
  • favoritas37/cassandra-fs another fork ... last touched 3 months ago and indicates compatibility with the 1.0.5 branch of Cassandra

The principle behind this is to take a file, break it into a set of chunks, and store those chunks as columns in a row. When retrieving, pull each column, reassemble the file, and voila.
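To make the chunking idea concrete, here is a minimal sketch. It assumes a plain Python dict stands in for a Cassandra row (column name mapped to column value); the function names, the tiny `CHUNK_SIZE`, and the zero-padded column naming are illustrative assumptions, not the cassandra-fs API.

```python
CHUNK_SIZE = 4  # bytes; tiny for demonstration, real chunks would be far larger


def split_into_columns(data: bytes, chunk_size: int = CHUNK_SIZE) -> dict:
    """Break a file's bytes into chunks keyed by zero-padded column names."""
    row = {}
    for index, start in enumerate(range(0, len(data), chunk_size)):
        # Zero-padding keeps lexical column order equal to chunk order.
        row[f"chunk-{index:08d}"] = data[start:start + chunk_size]
    return row


def reassemble(row: dict) -> bytes:
    """Pull each column in name order and concatenate the chunks."""
    return b"".join(row[name] for name in sorted(row))


# Hypothetical stand-in for the contents of an image file.
image_bytes = b"\x89PNG-fake-image-bytes"
row = split_into_columns(image_bytes)
restored = reassemble(row)
```

With a real cluster, each dict entry would become one column write and `reassemble` would read the columns back in order; the ordering trick with padded names is one possible choice, not a requirement.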

Cassandra FAQ: large file and blob storage

...files of around 64Mb and smaller can be easily stored in the database without splitting them into smaller chunks...

Lucene indexes in Cassandra

...its files are broken down into blocks (whose sizes are capped), where each block (see FileBlock) is stored as the value of a column in the corresponding row...
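One benefit of capped-size blocks is that a reader can fetch only the blocks overlapping the byte range it needs. The sketch below illustrates that under stated assumptions: a dict keyed by block number stands in for the row, and `BLOCK_SIZE`, `write_blocks`, and `read_range` are hypothetical names, not taken from the project quoted above.

```python
BLOCK_SIZE = 8  # hypothetical cap on block size, in bytes


def write_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Store data as capped-size blocks keyed by block number."""
    return {
        start // block_size: data[start:start + block_size]
        for start in range(0, len(data), block_size)
    }


def read_range(blocks: dict, offset: int, length: int,
               block_size: int = BLOCK_SIZE) -> bytes:
    """Read `length` bytes from `offset`, touching only overlapping blocks."""
    if length <= 0:
        return b""
    first = offset // block_size
    last = (offset + length - 1) // block_size
    # Missing block numbers (past the end of the file) contribute nothing.
    joined = b"".join(blocks.get(n, b"") for n in range(first, last + 1))
    start = offset - first * block_size
    return joined[start:start + length]


data = b"0123456789abcdefghijklmnopqrstuvwxyz"
blocks = write_blocks(data)
middle = read_range(blocks, 6, 10)  # spans the first two blocks only
```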

You'll get more positive feedback on the Cassandra mailing list and on the IRC channel.

Finally, this is from 2009, and written by folks at Facebook, which should go some way to help answer more of the fundamental questions you have: Cassandra - A Decentralized Structured Storage System.

sdolgy answered Sep 18 '22 13:09
