Which resources should one monitor on a Linux server running a web server or database?

Tags: performance, linux, sysadmin

When running any kind of server, there are several resources that one would like to monitor to make sure the server is healthy. This is especially true when testing the system under load.

Some examples would be CPU utilization, memory usage, and perhaps disk space. What other resources should I be monitoring, and what tools are available to do so?

asked Sep 16 '08 17:09

oneself

2 Answers

As many as you can afford to collect and can then graph, understand, and actually look at. Monitoring resources is useful not only for capacity planning but also for anomaly detection, and anomaly detection significantly improves your ability to detect security events.

You have a decent start with your basic graphs. I'd also want to monitor the number of threads, number of connections, network I/O, disk I/O, page faults (arguably related to memory usage), and context switches.
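
For the last two items, here is a minimal sketch of how context switches and page faults could be sampled on Linux. It is not part of any monitoring tool; the function names are made up for illustration. It assumes the kernel exposes the cumulative counters ctxt in /proc/stat and pgfault and pgmajfault in /proc/vmstat, and it converts two snapshots into per-second rates.

```python
import time


def parse_counters(text):
    """Parse 'name value' lines into a dict, skipping lines of any other shape."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def rate_per_second(before, after, interval_s):
    """Turn two snapshots of cumulative counters into per-second rates."""
    return {
        name: (after[name] - before[name]) / interval_s
        for name in after
        if name in before
    }


def read_snapshot():
    """Read the kernel counters (Linux only; assumes /proc is mounted)."""
    counters = {}
    for path in ("/proc/stat", "/proc/vmstat"):
        with open(path) as f:
            counters.update(parse_counters(f.read()))
    return counters


def sample(interval_s=1.0):
    """Return context-switch and page-fault rates over one interval."""
    before = read_snapshot()
    time.sleep(interval_s)
    after = read_snapshot()
    rates = rate_per_second(before, after, interval_s)
    wanted = ("ctxt", "pgfault", "pgmajfault")
    return {name: rates[name] for name in wanted if name in rates}
```

Calling sample(5.0) on a Linux host would return a dict of rates that can be fed to whatever graphing tool you use. Keeping the parsing separate from the file reads makes the logic easy to check without a live /proc.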

I really like Munin for graphing things related to hosts.

answered Oct 14 '22 01:10

Daniel Papasian

I use Zabbix extensively in production, which comes with a stack of useful defaults. Some examples of the sorts of things we've configured it to monitor:

Network usage
CPU usage (% user, system, nice times)
Load averages (1m, 5m, 15m)
RAM usage (real, swap, shm)
Disc throughput
Active connections (by port number)
Number of processes (by process type)
Ping time from remote location
Time to SSL certificate expiry
MySQL internals (query cache usage, number of temporary tables in RAM and on disc, etc.)
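
As an illustration of one item from the list, time to SSL certificate expiry, here is a rough sketch using only the Python standard library. This is not how Zabbix implements the check; the function names and the choice of port 443 are assumptions made for the example.

```python
import socket
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Days left before a certificate's notAfter timestamp.

    not_after uses the format returned by getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT".
    """
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400.0


def fetch_not_after(host, port=443, timeout=5.0):
    """Connect to host:port and return the certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Something like days_until_expiry(fetch_not_after("example.com")) gives a number you can graph and alert on. Note that the default context verifies the certificate, so an already expired certificate raises an error instead of returning a negative value.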

Anything you can monitor with Zabbix, you can also attach triggers to, so it can restart failed services or page you about problems.
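
To make the trigger idea concrete, here is a small sketch of the general pattern. It is not Zabbix's trigger syntax or API; the Trigger class, its fields, and the "fire after N consecutive breaches" rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Trigger:
    """Hypothetical threshold trigger: run an action after repeated breaches."""

    name: str
    threshold: float
    required_breaches: int
    action: Callable[[str, float], None]
    breaches: int = field(default=0, init=False)

    def observe(self, value):
        """Record one sample; return True if the action fired on this sample."""
        if value > self.threshold:
            self.breaches += 1
        else:
            self.breaches = 0  # recovery resets the streak
        if self.breaches == self.required_breaches:
            # Fires once per streak, not on every sample while still breaching.
            self.action(self.name, value)
            return True
        return False
```

The action could be anything callable, such as a function that restarts a service or sends a page. Requiring several consecutive breaches is one simple way to avoid alerting on a single noisy sample.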

Collect the data now, before performance becomes an issue. When it does, you'll be glad of the historical baselines, and of being able to show the date and time the problems started, for when you need to hunt down and punish exactly which developer made bad changes :)
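
One simple way to use a historical baseline, sketched here under assumptions: compare a new sample against the mean and standard deviation of past samples and flag it if it is too far out. The function name and the three-sigma default are arbitrary choices for the example, not something Zabbix prescribes.

```python
import statistics


def deviates_from_baseline(history, value, max_sigma=3.0):
    """Return True if value is more than max_sigma standard deviations
    away from the mean of the historical samples."""
    if len(history) < 2:
        return False  # not enough data to form a baseline yet
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return value != mean
    return abs(value - mean) / stdev > max_sigma
```

This only works if the history was collected while the system was healthy, which is the point of starting to collect early.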

answered Oct 14 '22 00:10

Jon Topper
