
What to monitor on SQL Server

I have been asked to monitor SQL Server (2005 & 2008) and am wondering which metrics are worth watching. I can access WMI counters but am slightly lost as to how much depth is going to be useful.

Currently I have on my list:

  • user connections
  • logins per second
  • latch waits per second
  • total latch wait time
  • deadlocks per second
  • errors per second
  • Log and data file sizes

I want to monitor values that indicate a degradation of performance on the machine or a potentially serious issue. To this end, I am also wondering at what values some of these counters would be considered normal vs problematic.

As I reckon this would be a really good question to have answered for the general community, I thought I'd court some of you DBA experts out there (I am certainly not one of them!).

Apologies if this is a rather open-ended question. Ry

rjshuttleworth asked Jun 08 '10 13:06


2 Answers

I would also monitor page life expectancy and your buffer cache hit ratio. See the post "Use sys.dm_os_performance_counters to get your Buffer cache hit ratio and Page life expectancy counters" for details.
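As a rough illustration of how those two counters can be read, here is a minimal Python sketch. It assumes you have already run a query like the one in `QUERY` against `sys.dm_os_performance_counters` with whatever database driver you use, and that you pass the resulting `(counter_name, cntr_value)` pairs in. The function name `summarize_buffer_counters` and the keys of the returned dict are made up for this example. The hit ratio is not stored as a percentage: it has to be divided by its companion "base" counter.

```python
# Query text is illustrative; run it with your own driver and connection.
QUERY = """
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Buffer Manager%'
  AND counter_name IN ('Buffer cache hit ratio',
                       'Buffer cache hit ratio base',
                       'Page life expectancy');
"""


def summarize_buffer_counters(rows):
    """Hypothetical helper: turn (counter_name, cntr_value) rows into metrics.

    counter_name comes back padded with trailing spaces, so names are
    stripped before lookup.
    """
    values = {name.strip(): value for name, value in rows}

    ratio = values.get("Buffer cache hit ratio")
    base = values.get("Buffer cache hit ratio base")

    # The ratio counter is only meaningful when divided by its base counter.
    hit_ratio_pct = None
    if ratio is not None and base:
        hit_ratio_pct = 100.0 * ratio / base

    return {
        "buffer_cache_hit_ratio_pct": hit_ratio_pct,
        # Page life expectancy is reported in seconds.
        "page_life_expectancy_s": values.get("Page life expectancy"),
    }
```

Sampling these values at a regular interval and recording them gives you a baseline for your own server, which is usually more useful than a single fixed threshold.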

SQLMenace answered Sep 22 '22 18:09


Late answer, but it may be of interest to other readers.

One of my colleagues had a similar problem and used this thread to get started. He also ran into a blog post describing common causes of performance issues, with guidance on which metrics should be monitored besides the ones already mentioned here. These other metrics are:

• %Disk Time:

This counter indicates a disk problem, but must be observed in conjunction with the Current Disk Queue Length counter to be truly informative. Note also that the disk can be a bottleneck before %Disk Time reaches 100%.

• %Disk Read Time and the %Disk Write Time:

The %Disk Read Time and %Disk Write Time metrics are similar to %Disk Time, but cover only read operations and write operations, respectively. They are actually the Average Disk Read Queue Length and Average Disk Write Queue Length values presented as percentages.

• %Idle Time:

Measures the percentage of time the disk was idle during the sample interval. If this counter falls below 20 percent, the disk system is saturated. You may consider replacing the current disk system with a faster disk system.

• %Free Space:

Measures the percentage of free space on the selected logical disk drive. Take note if this falls below 15 percent, as you risk running out of free space for the OS to store critical files. One obvious solution here is to add more disk space.
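To make the two numeric thresholds above concrete, here is a small Python sketch that flags a sample when %Idle Time drops below 20 or %Free Space drops below 15. The function name `check_disk_sample` and the constant names are made up for this example, and the sample values are assumed to come from whatever collector you already use (WMI, perfmon export, etc.). Treat the cut-offs as the rules of thumb quoted above, not as hard limits.

```python
# Thresholds taken from the rules of thumb quoted above.
IDLE_TIME_MIN_PCT = 20.0   # below this, the disk system is likely saturated
FREE_SPACE_MIN_PCT = 15.0  # below this, free space is getting risky


def check_disk_sample(idle_time_pct, free_space_pct):
    """Hypothetical helper: return a list of warnings for one sample."""
    warnings = []
    if idle_time_pct < IDLE_TIME_MIN_PCT:
        warnings.append(
            "%Idle Time is {:.1f}, below {:.0f}: disk may be saturated".format(
                idle_time_pct, IDLE_TIME_MIN_PCT
            )
        )
    if free_space_pct < FREE_SPACE_MIN_PCT:
        warnings.append(
            "%Free Space is {:.1f}, below {:.0f}: consider adding disk space".format(
                free_space_pct, FREE_SPACE_MIN_PCT
            )
        )
    return warnings
```

For example, `check_disk_sample(10, 40)` returns a single warning about idle time, while a healthy sample returns an empty list.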

If you would like to read the whole post, you may find it here: http://www.sqlshack.com/sql-server-disk-performance-metrics-part-2-important-disk-performance-measures/

Jessica answered Sep 22 '22 18:09