Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how to move away from guid fields as primary keys

I am in a big mess and looking to see if I can get out of it. I have a database with all primary keys on tables defined on uniqueidentifier columns. This was kind of forced on us, giving 'security' as one of the reasons. The other I guess comparatively worthier reason is that some of the tables participate in replication.

I am reviewing the database and I feel one easily avoidable future performance bottleneck is to may add auto-increment bigint columns in all tables and make them primary keys (clustered). And somehow 'attach' the pk-fk relationships properly. But still retain the old columns for any future use.

Any suggestions/comments/donots in this regard? Ours is a c#/MSSQL Server R2/Linq environment.

EDIT: Looking at the comments, I realize I left out a few important details. All primary key Guid fields are Clustered and no I am not using newsequentialId (We use Linq to SQL, Primary keys are generated client side. With replication involved, we weren't sure of a way to correctly generate sequential ids from different client environments without conflicts).

My 'feeling' is due to the known fact that clustered index on a guid column will cause high degree of fragmentation and will only worsen the situation as the database grows.

Also, I am not really trying optimize right now, but trying to correct a bad design to avoid future headaches when database gets too large. Wanted to know if it is worth doing it now.

Helpful discussion also related to the issue here in this post, and another

like image 912
Brian Avatar asked Dec 09 '22 13:12

Brian


2 Answers

Performance tuning a database is actually simple, but it's hard. First, you need to gather the list of statements that are being executed against the database by running a long-running profile over the course of at least one business day, but ideally two.

Save that profile to the database so it can be queried, and thus you can easily find DISTINCT queries that are executed against your database.

After determining the ones that execute the most, analyze their execution plans, it's likely it has nothing to do with the GUID's and everything to do with either the queries themselves (i.e. they're just terrible) or you need a different index.

Things to watch out for:

  1. Views that are heavily filtered with a WHERE clause. These are fantastic candidates for either Stored Procedures or Parameterized Views.
  2. Statements that JOIN to very large tables, those at times can be good candidates for sub-queries instead. This does depend on the execution plan.
  3. Statements that appear to be executed multiple times. This is often a good sign that the application itself just isn't doing a good job managing how often it makes round-trips to the server. I've seen applications that will run the same query 10+ times.
like image 82
Mike Perrenoud Avatar answered Dec 21 '22 19:12

Mike Perrenoud


A stopgap measure could be to use NEWSEQUENTIALID() instead of NEWID(). At least this way you won't get as much fragmentation.

Creates a GUID that is greater than any GUID previously generated by this function on a specified computer since Windows was started. After restarting Windows, the GUID can start again from a lower range, but is still globally unique. When a GUID column is used as a row identifier, using NEWSEQUENTIALID can be faster than using the NEWID function. This is because the NEWID function causes random activity and uses fewer cached data pages. Using NEWSEQUENTIALID also helps to completely fill the data and index pages.

http://technet.microsoft.com/en-us/library/ms189786.aspx

Note that this won't necessarily help resolve your performance woes depending on what they are (perhaps you can elaborate - INSERTs? SELECT filtering? etc.).

like image 31
Mark Sowul Avatar answered Dec 21 '22 19:12

Mark Sowul