
SQL Server: ~2000 Heap Tables all using GUID Uniqueidentifier - Possible Clustered Indexing?

I have just taken over a database which has around 2200 tables. Over 2000 of these have no clustered index (some have no indexes at all).

All of the tables use a GUID (uniqueidentifier) column as their identifier.

Just looking at the query plans, I can see that there are many table scans occurring. Most searches filter on the uniqueidentifier column.

I am wondering if it is better to have a clustered index on the GUID than not to have a clustered index at all. I imagine that a clustered index on a 16-byte column will inevitably lead to fragmentation.

I could arguably cluster on other columns, but the majority of searches tend to search by or join via the GUIDs.

Any advice would be very welcome. I've never seen so many GUIDs!

Stevie Gray asked Dec 18 '22 04:12


2 Answers

In general, I would recommend having an identity column as the primary key and using that for clustering. This is also a better choice for joins.

Why? First, identity keys are generally shorter than uniqueidentifiers (4 bytes for an int or 8 for a bigint, versus 16 for a uniqueidentifier). So, foreign key references and indexes are smaller.

More importantly, inserts would always go at the "end" of the table. When using GUIDs, inserts are often going to cause fragmentation. If you are inserting rows, I would say that a secondary index on the GUID might be better than a clustered index (the fragmentation is only in the index).
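
The difference between "append at the end" and "insert anywhere" can be illustrated with a toy model. The sketch below is not SQL Server's storage engine: it assumes fixed-size pages holding a hypothetical 8 keys each, treats GUIDs as plain 128-bit integers, and uses a made-up helper name (`simulate_inserts`). It only counts how often a full page has to be split when keys arrive in increasing order versus random order.

```python
import bisect
import random


def simulate_inserts(keys, page_capacity=8):
    """Insert keys one by one into fixed-size, key-ordered pages.

    Toy model (assumes distinct keys). Returns (page_splits, average_page_fill).
    """
    pages = [[]]     # pages in key order; each page is a sorted list of keys
    boundaries = []  # boundaries[i] is the smallest key allowed on pages[i + 1]
    splits = 0
    for key in keys:
        idx = bisect.bisect_right(boundaries, key)  # page whose range covers key
        page = pages[idx]
        if len(page) < page_capacity:
            bisect.insort(page, key)
        elif idx == len(pages) - 1 and key > page[-1]:
            # Key goes past the end of the last, full page: start a fresh page.
            pages.append([key])
            boundaries.append(key)
        else:
            # Key lands inside a full page: split the page roughly in half.
            bisect.insort(page, key)
            middle = len(page) // 2
            new_page = page[middle:]
            del page[middle:]
            pages.insert(idx + 1, new_page)
            boundaries.insert(idx, new_page[0])
            splits += 1
    total_keys = sum(len(page) for page in pages)
    return splits, total_keys / (len(pages) * page_capacity)


rng = random.Random(42)  # fixed seed so runs are repeatable
identity_like = range(10_000)                                 # ever-increasing keys
guid_like = [rng.getrandbits(128) for _ in range(10_000)]     # random 128-bit keys

for label, keys in (("identity-like", identity_like), ("random GUID-like", guid_like)):
    splits, fill = simulate_inserts(keys)
    print(f"{label}: {splits} page splits, average page fill {fill:.0%}")
```

In this model the increasing keys never split a page and leave every page full, while the random keys split pages repeatedly and leave them only partly full. Real numbers on a real server depend on row size, fill factor, and workload, so treat this only as an illustration of the mechanism.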

With 2000 tables, I doubt you will change the structure. You can ameliorate the fragmentation by generating the values with newsequentialid(), which can only be used as the column's DEFAULT.

Gordon Linoff answered Dec 28 '22 06:12


A GUID column with random values is usually not the best choice for a clustered index, because it can be a root cause of index fragmentation, with two consequences:

  1. Read-ahead becomes less effective, because logically adjacent pages are no longer physically contiguous;
  2. Insert operations become expensive, because you get a lot of page-split overhead.

There are three ways to live with that:

  1. Schedule regular index reorganizing and rebuilding, which reduces index fragmentation (a rebuild also updates the index's statistics);
  2. Use newsequentialid() to generate the values of this column;
  3. Generate the GUID values sequentially outside of the database (the Guid.Comb identifier in NHibernate is a great example of solving this issue).
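
As an illustration of the third option, here is a minimal sketch of a COMB-style generator. It is not NHibernate's implementation: the function name `comb_guid`, the use of a 48-bit Unix millisecond timestamp, and the byte layout are assumptions made for this example. The idea is to keep 10 random bytes and put a timestamp in the last 6 bytes, since SQL Server is generally described as comparing uniqueidentifier values starting from that last 6-byte group.

```python
import os
import time
import uuid


def comb_guid(timestamp_ms=None):
    """Return a COMB-style GUID: 10 random bytes followed by a 6-byte timestamp.

    Hypothetical helper for illustration only. The result does not carry
    RFC 4122 version bits; it is just a 16-byte value formatted as a GUID.
    """
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000   # current time in milliseconds
    random_part = os.urandom(10)                      # keeps values unique
    # Big-endian so that later timestamps compare greater byte by byte.
    time_part = (timestamp_ms & 0xFFFFFFFFFFFF).to_bytes(6, "big")
    return uuid.UUID(bytes=random_part + time_part)


earlier = comb_guid(timestamp_ms=1_000)
later = comb_guid(timestamp_ms=2_000)

# The last group of the textual form holds the timestamp, so values generated
# later sort after earlier ones when that group is compared first.
print(earlier, later)
print(earlier.bytes[10:] < later.bytes[10:])
```

Values created within the same millisecond still differ in their random part, so they stay unique but are not strictly ordered among themselves. Before relying on a scheme like this, check the resulting fragmentation on a test table.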

Maxim Zhukov answered Dec 28 '22 07:12
