I have a m1.small instance in amazon with 8GB hard disk space on which my rails application runs. It runs smoothly for 2 weeks and after that it crashes saying the memory is full. App is running on rails 3.1.1, unicorn and nginx
I simply dont understand what is taking 13G ?
I killed unicorn and 'free' command is showing some free space while df is still saying 100%
I rebooted the instance and everything started working fine.  
             total       used       free     shared    buffers     cached  
Mem:       1705192    1671580      33612          0     321816     405288  
-/+ buffers/cache:     944476     760716   
Swap:       917500      50812     866688 
Filesystem           1K-blocks      Used Available Use% Mounted on  
/dev/xvda1             8256952   7837520         4 100% /  
none                    847464       120    847344   1% /dev  
none                    852596         0    852596   0% /dev/shm  
none                    852596        56    852540   1% /var/run  
none                    852596         0    852596   0% /var/lock  
/dev/xvda2           153899044    192068 145889352   1% /mnt  
/dev/xvdf             51606140  10276704  38707996  21% /data  
28K ./root  
6.6M    ./etc  
4.0K    ./opt  
9.7G    ./data  
1.7G    ./usr  
4.0K    ./media  
du: cannot access `./proc/27220/task/27220/fd/4': No such file or directory  
du: cannot access `./proc/27220/task/27220/fdinfo/4': No such file or directory  
du: cannot access `./proc/27220/fd/4': No such file or directory  
du: cannot access `./proc/27220/fdinfo/4': No such file or directory  
0   ./proc  
14M ./boot  
120K    ./dev  
1.1G    ./home  
66M ./lib  
4.0K    ./selinux  
6.5M    ./sbin  
6.5M    ./bin  
4.0K    ./srv  
148K    ./tmp  
16K ./lost+found  
20K ./mnt  
0   ./sys  
253M    ./var  
13G .  
13G total   
             total       used       free     shared    buffers     cached    
Mem:       1705192     985876     **719316**          0     365536     228576    
-/+ buffers/cache:     391764    1313428    
Swap:       917500      46176     871324  
Filesystem           1K-blocks      Used Available Use% Mounted on  
/dev/xvda1             8256952   7837516         8 100% /  
none                    847464       120    847344   1% /dev  
none                    852596         0    852596   0% /dev/shm  
none                    852596        56    852540   1% /var/run  
none                    852596         0    852596   0% /var/lock  
/dev/xvda2           153899044    192068 145889352   1% /mnt  
/dev/xvdf             51606140  10276704  38707996  21% /data  
rails_env = 'production'  
working_directory "/home/user/app_name"  
worker_processes 5  
preload_app true  
timeout 60  
rails_root = "/home/user/app_name"  
listen "#{rails_root}/tmp/sockets/unicorn.sock", :backlog => 2048  
# listen 3000, :tcp_nopush => false  
pid "#{rails_root}/tmp/pids/unicorn.pid"  
stderr_path "#{rails_root}/log/unicorn/unicorn.err.log"  
stdout_path "#{rails_root}/log/unicorn/unicorn.out.log"  
GC.copy_on_write_friendly = true if GC.respond_to?(:copy_on_write_friendly=)  
before_fork do |server, worker|  
  ActiveRecord::Base.connection.disconnect!  
  ##  
  # When sent a USR2, Unicorn will suffix its pidfile with .oldbin and  
  # immediately start loading up a new version of itself (loaded with a new  
  # version of our app). When this new Unicorn is completely loaded  
  # it will begin spawning workers. The first worker spawned will check to  
  # see if an .oldbin pidfile exists. If so, this means we've just booted up  
  # a new Unicorn and need to tell the old one that it can now die. To do so  
  # we send it a QUIT.  
  #  
  # Using this method we get 0 downtime deploys.  
  old_pid = "#{rails_root}/tmp/pids/unicorn.pid.oldbin"  
  if File.exists?(old_pid) && server.pid != old_pid  
    begin  
      Process.kill("QUIT", File.read(old_pid).to_i)  
    rescue Errno::ENOENT, Errno::ESRCH  
      # someone else did our job for us  
    end  
  end  
end  
after_fork do |server, worker|  
  ActiveRecord::Base.establish_connection  
  worker.user('rails', 'rails') if Process.euid == 0 && rails_env == 'production'  
end  
i've just released 'unicorn-worker-killer' gem. This enables you to kill Unicorn worker based on 1) Max number of requests and 2) Process memory size (RSS), without affecting the request.
It's really easy to use. No external tool is required. At first, please add this line to your Gemfile.
gem 'unicorn-worker-killer'
Then, please add the following lines to your config.ru.
# Unicorn self-process killer
require 'unicorn/worker_killer'
# Max requests per worker
use Unicorn::WorkerKiller::MaxRequests, 10240 + Random.rand(10240)
# Max memory size (RSS) per worker
use Unicorn::WorkerKiller::Oom, (96 + Random.rand(32)) * 1024**2
It's highly recommended to randomize the threshold to avoid killing all workers at once.
As Preston mentioned you don't have a memory problem (over 40% free), you have a disk full problem. du reports most of the storage is consumed in /root/data.
You could use find to identify very large files, eg, the following will show all files under that dir greater than 100MB in size.
sudo find /root/data -size +100M
If unicorn is still running, lsof (LiSt Open Files) can show what files are in use by your running programs or by a specific set of processes (-p PID), eg:
sudo lsof | awk  '$5 ~/REG/ && $7 > 100000000 { print }'
will show you open files greater than 100MB in size
I think you are conflating memory usage and disk space usage. It looks like Unicorn and its children were using around 500 MB of memory, you look at the second "-/+ buffers/cache:" number to see the real free memory. As far as the disk space goes, my bet goes on some sort of log file or something like that going nuts. You should do a du -h in the data directory to find out what exactly is using so much storage. As a final suggestion, it's a little known fact that Ruby never returns memory back to the OS if it allocates it. It DOES still use it internally, but once Ruby grabs some memory the only way to get it to yield the unused memory back to the OS is to quit the process. For example, if you happen to have a process that spikes your memory usage to 500 MB, you won't be able to use that 500 MB again, even after the request has completed and the GC cycle has run. However, Ruby will reuse that allocated memory for future requests, so it is unlikely to grow further.
Finally, Sergei mentions God to monitor the process memory. If you are interested in using this, there is already a good config file here. Be sure to read the associated article as there are key things in the unicorn config file that this god config assumes you have.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With