I'm going to build a large-file server and need the Stack Overflow community's advice on the choice of file system (Linux).
The file server is going to serve 1-2 GB static files (mostly a different file with every request) via Nginx, under a constant moderate write load to the disks (a massive RAID5 array of SATA 7200 RPM disks). The write-to-read ratio is about 1:5-10; for every byte written per second, 5-10 bytes are read. Most important for me is read performance; I can live with slower writes.
What Linux file system would be the best solution for this task? And why :) Thanks!
To get the best results when serving heavy content, there is more to tune than the file system alone. Please take a look at the Nginx core developer's comment below:
Switch off sendfile; it performs badly on such workloads under Linux because there is no way to control readahead (and hence which blocks are read from disk).
sendfile off;
Use large output buffers
output_buffers 1 512k;
Try using AIO to get better disk concurrency (and note that under Linux it requires directio as well), i.e. something like this:
aio on; directio 512;
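Putting these three recommendations together, a minimal sketch of the relevant nginx configuration might look like the following (the listen port, root path, and location are hypothetical placeholders, not part of the original advice):

server {
    listen 80;
    root   /srv/files;            # hypothetical path to the large static files

    location / {
        sendfile       off;       # sendfile gives no readahead control under Linux
        output_buffers 1 512k;    # one large userland output buffer per connection
        aio            on;        # asynchronous I/O for better disk concurrency
        directio       512;       # required for aio under Linux; responses over 512 bytes bypass the page cache
    }
}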
Other recommendations:
Check that swap is not being used.
Filesystem: ext4 or XFS. It is good to enable the data=writeback (ext4) and noatime mount options.
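For example, a sketch of an /etc/fstab entry with those options for an ext4 data volume (device and mount point are hypothetical):

# /etc/fstab entry for the data volume holding the large static files
/dev/md0  /srv/files  ext4  defaults,noatime,data=writeback  0  2

Keep in mind that data=writeback trades crash consistency of recently written file data for speed, which fits a workload where reads dominate.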
I achieved 80 MB/s of "random read" performance per "real" disk (spindle). Here are my findings.
So, first decide how much traffic you need to push down to users and how much storage you need per server.
You may skip the disk setup advice given below since you already have a RAID5 setup.
Let's take the example of a dedicated 1 Gbps server with 3 × 2 TB disks. Keep the first disk dedicated to the OS and tmp. For the other two disks you can create a software RAID (for me, it worked better than the on-board hardware RAID). Otherwise, you need to divide your files equally across the independent disks; the idea is to have both disks share the read/write load equally. Software RAID-0 is the best option, created for example as sketched below.
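A minimal sketch, assuming the two data disks are /dev/sdb and /dev/sdc and the mount point is /srv/files (all hypothetical):

# create a 2-disk software RAID-0 array (stripes the load across both spindles)
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc
mkfs.ext4 /dev/md0            # or mkfs.xfs, per the filesystem advice above
mount -o noatime /dev/md0 /srv/files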
Nginx Conf

There are two ways to achieve a high level of performance using nginx:
1. Use directio
aio on;
directio 512;
output_buffers 1 8m;
"This option will require you to have good amount of ram" Around 12-16GB of ram is needed.
2. Userland IO
output_buffers 1 2m;
"make sure you have set readahead to 4-6MB for software raid mount" blockdev --setra 4096 /dev/md0 (or independent disk mount)
This setting makes optimal use of the system file cache and requires much less RAM: around 8 GB is needed.
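As a sketch of setting and verifying readahead (assuming the array is /dev/md0; 8192 sectors × 512 B = 4 MB, matching the 4-6 MB recommendation):

blockdev --setra 8192 /dev/md0                   # readahead in 512-byte sectors: 8192 = 4 MB
blockdev --getra /dev/md0                        # verify; prints the current value in sectors
echo 4096 > /sys/block/md0/queue/read_ahead_kb   # equivalent sysfs knob, in KB

Note that blockdev settings do not survive a reboot, so they belong in something like rc.local or a udev rule.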
Common Notes:
You may also like to use bandwidth throttling to allow hundreds of connections over the available bandwidth. Each download connection will use about 4 MB of active RAM.
limit_rate_after 2m;
limit_rate 100k;
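In context, a minimal sketch of where these directives sit (the download location is a hypothetical placeholder):

location /downloads/ {
    limit_rate_after 2m;     # serve the first 2 MB at full speed
    limit_rate       100k;   # then throttle each connection to 100 KB/s
}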
Both of the above solutions will scale easily to 1k+ simultaneous users on a 3-disk server, assuming you have 1 Gbps bandwidth and each connection is throttled at about 1 Mbps. There is additional setup needed to optimize disk writes without affecting reads much.
Make all uploads go to the main OS disk, on a mount such as /tmpuploads. This ensures there is no intermittent disturbance while heavy reads are going on. Then move each file out of /tmpuploads using the dd command with oflag=direct, something like:
dd if=/tmpuploads/<myfile> of=/raidmount/uploads/<myfile> oflag=direct bs=8192k
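Since dd copies rather than moves, a sketch of the complete step (the file name and paths are the placeholders from above) would be:

FILE=myfile    # hypothetical file name
# oflag=direct bypasses the page cache, so the copy does not evict cached read data
dd if=/tmpuploads/"$FILE" of=/raidmount/uploads/"$FILE" oflag=direct bs=8192k
rm /tmpuploads/"$FILE"    # remove the temporary copy to complete the move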