What are the top issues and in which order of importance to look into while optimizing (performance tuning, troubleshooting) an existing (but unknown to you) database?
Which actions/measures in your previous optimizations gave the most effect (with possibly the minimum of work) ?
I'd like to partition this question into following categories (in order of interest to me):
Update:
Suppose I have a copy of a database on dev machine without access to production environment to observe stats, most used queries, performance counters, etc. in real use.
This is development-related but not DBA-related question.
Update2:
Suppose the database was developed by others and was given to me for optimization (review) before it was delivered to production.
It is quite usual to have outsourced development detached from end-users.
Besides, there is a database design paradigm that a database, in contrast to application data storage, should be a value in itself independently on specific applications that use it or on context of its use.
Update3: Thanks to all answerers! You all pushed me to open subquestion
How do you stress load dev database (server) locally?
Create a performance Baseline (non-intrusive, use performance counters)
Identify the most expensive queries (non-intrusive, use SQL Profiler)
Identify the most frequently run queries (non-intrusive, use SQL Profiler)
Identify any overly complex queries, or those using slowly performing constructs or patterns. (non-intrusive to identify, use SQL Profiler and/or code inspections; possibly intrusive if changed, may require substantial re-testing)
Assess your hardware
Identify Indexes that would benefit the measured workload (non-intrusive, use SQL Profiler)
Measure and compare to your baseline.
If you have very large databases, or extreme operating conditions (such as 24/7 or ultra high query loads), look at the high end features offered by your RDBMS, such as table/index partitioning.
This may be of interest: How Can I Log and Find the Most Expensive Queries?
If the database is unknown to you, and you're under pressure, then you may not have time for Mitch's checklist which is good best practice to monitor server health.
You also need access to production to gather real info from assorted queries you can run. Without this, you're doomed. The server load pattern is important: you can't reproduce many issue yourself on a development server because you won't use the system like an end user.
Also, focus on "biggest bang for the buck". An expensive query running once daily at 3am can be ignored. A not-so-expensive one running every second is well worth optimising. However, you may not know this without knowing server load pattern.
So, basic steps..
Assuming you're firefighting:
Slow response:
Things you should have:
Edit: After your update
Static analysis is best practices only: you can't optimise for usage. This is all you can do. This is marc_s' answer.
You can guess what the most common query may be, but you can't guess how much data will be written or how badly a query scales with more data
In many shops developers provide some support, either directly or as *3rd line"
If you've been given a DB for review by another team that you hand over to another team to deploy: that's odd.
If you're not interested in the runtime behavior of the database, e.g. what are the most frequently executed queries and those that consume the most time, you can only do a "static" analysis of the database structure itself. That has a lot less value, really, since you can only check for a number of key indicators of bad design - but you cannot really tell much about the "dynamics" of the system being used.
Things I would check for in a database that I get as a .bak
file - without the ability to collect live and actual runtime performance statistics - would be:
normalization - is the table structure normalized to third normal form? (at least most of the time - there might be some exceptions)
do all tables have a primary key? ("if it doesn't have a primary key, it's not a table", after all)
For SQL Server: do all the tables have a good clustering index? A unique, narrow, static, and preferably ever-increasing clustered key - ideally an INT IDENTITY, and most definitely not a large compound index of many fields, no GUID's and no large VARCHAR fields (see Kimberly Tripp's excellent blog posts on the topics for details)
are there any check and default constraints on the database tables?
are all the foreign key fields backed up by a non-clustered index to speed up JOIN queries?
are there any other, obvious "deadly sins" in the database, e.g. overly complicated views, or really badly designed tables etc.
But again: without actual runtime statistics, you're quite limited in what you can do from a "static analysis" point of view. The real optimization can only really happen when you have a workload from a regular day of operation, to see what queries are used frequently and put the most stress on your database --> use Mitch's checklist to check those points.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With