
Performance value of COMB guids

Jimmy Nilsson discusses his COMB guid concept here. This concept is popular in NHibernate, among other circles, for its supposed performance value over standard GUIDs which are typically far more random.

However, in testing, this does not appear to be the case. Am I missing something?

Test case:

I have a table called temp (not a temp table, just a table named "temp") with 585,000 rows in it. I have a new table called Codes, and wish to copy all 585,000 code values from the temp table to the codes table. The test SQL I executed was:

set statistics time on;

-- Run 1: standard random GUIDs
truncate table codes;
DBCC DBREINDEX ('codes', '', 90);

insert into codes (codeid, codevalue)
select newid(), codevalue from temp;

-- Run 2: COMB GUIDs (10 random bytes followed by 6 bytes taken from the current datetime)
truncate table codes;
DBCC DBREINDEX ('codes', '', 90);

insert into codes (codeid, codevalue)
select CAST(CAST(NEWID() AS BINARY(10)) + CAST(GETDATE() AS BINARY(6)) AS UNIQUEIDENTIFIER),
       codevalue
from temp;

Performance with standard GUID values:

SQL Server Execution Times: CPU time = 17250 ms, elapsed time = 15735 ms.

(585000 row(s) affected)

Performance with COMB GUID values:

SQL Server Execution Times: CPU time = 17500 ms, elapsed time = 16419 ms.

(585000 row(s) affected)

What am I missing? The COMB GUID values resulted in slightly longer times, presumably because of the additional conversions. I thought the point was to reduce the insert time by semi-ordering the GUIDs, using the date for the last 6 bytes, but the performance gain appears non-existent.

Chris asked Jul 20 '09 19:07




2 Answers

I'd suggest that you're not seeing the ordering benefit because the target table has no PK, so what you're measuring is just the conversion overhead. If it does have a PK, the 585k rows must still be sorted on insert. How does SQL Server know they're semi-sorted?
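To test the missing-index explanation, one could put a clustered primary key on codeid before rerunning the two inserts. This is only a minimal sketch: it assumes codeid is declared NOT NULL and that codes has no primary key yet, and the constraint name PK_codes is made up for illustration.

```sql
-- List whatever indexes the table currently has.
EXEC sp_helpindex 'codes';

-- Hypothetical constraint name; requires codeid to be NOT NULL
-- and the table to have no primary key already.
ALTER TABLE codes
    ADD CONSTRAINT PK_codes PRIMARY KEY CLUSTERED (codeid);
```

With the key in place, the two timed runs from the question would at least be exercising an index ordered on the GUID column.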

Now, if it were 5,850 inserts of 100 rows each, then you might see some benefit, because the new rows would go "at the end" rather than "in the middle", reducing page splits and overhead.
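The batched variant could be sketched roughly as follows. It assumes codes already has a clustered index on codeid (not confirmed in the question), and the temp table #numbered, its column rn, and the index name are invented here purely to slice the source into batches. As far as I know, GETDATE() is evaluated once per statement, so a single INSERT ... SELECT gives every row the same date bytes, whereas separate batches can pick up later timestamps.

```sql
SET NOCOUNT ON;

-- Number the source rows once so they can be sliced into batches.
-- ORDER BY (SELECT NULL) avoids assuming anything about codevalue's type.
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS rn, codevalue
INTO #numbered
FROM temp;

CREATE UNIQUE CLUSTERED INDEX IX_numbered_rn ON #numbered (rn);

DECLARE @batch INT, @start INT, @max INT;
SET @batch = 100;
SET @start = 1;
SELECT @max = MAX(rn) FROM #numbered;

-- One INSERT per batch: each statement evaluates GETDATE() afresh.
WHILE @start <= @max
BEGIN
    INSERT INTO codes (codeid, codevalue)
    SELECT CAST(CAST(NEWID() AS BINARY(10)) + CAST(GETDATE() AS BINARY(6)) AS UNIQUEIDENTIFIER),
           codevalue
    FROM #numbered
    WHERE rn BETWEEN @start AND @start + @batch - 1;

    SET @start = @start + @batch;
END;

DROP TABLE #numbered;
```

Running the same loop with plain NEWID() in place of the COMB expression would give the comparison case. Batches that complete within the same datetime tick will still share their date bytes, so the ordering is only approximate.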

I'd go further and say that the article is dated 2002, and is for SQL 2000, and has been overtaken by real life.

In SQL Server 2005 we have sequential GUIDs (NEWSEQUENTIALID()), which give strictly monotonic GUIDs and solve some of these issues. GUID as PK has been discussed here too; a recent example: INT vs Unique-Identifier for ID field in database with 3rd party links.
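As an illustration, NEWSEQUENTIALID() can only be used as a column default, so a table built around it might look like the sketch below. The table name codes_seq, the constraint names, and the VARCHAR(50) type for codevalue are all assumptions, since the question does not show the real schema.

```sql
-- Hypothetical table; the codevalue type and all names are assumed.
CREATE TABLE codes_seq (
    codeid UNIQUEIDENTIFIER NOT NULL
        CONSTRAINT DF_codes_seq_codeid DEFAULT NEWSEQUENTIALID()
        CONSTRAINT PK_codes_seq PRIMARY KEY CLUSTERED,
    codevalue VARCHAR(50) NOT NULL
);

-- codeid is omitted so the default generates it for every row.
INSERT INTO codes_seq (codevalue)
SELECT codevalue FROM temp;
```

Because the values are generated by the server in increasing order, new rows should land at the end of the clustered index rather than at random positions.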

If an ORM dictates GUID as a PK rather than a natural key or standard int-based surrogate key, that's a severe limitation of the ORM. And a case of the client tail wagging the database dog.

gbn answered Sep 19 '22 23:09


I second that: you'll see differences only when you have indexes (PK, FK, or other kinds of index, clustered or nonclustered) on the GUID column, because the cost of a standard guid versus newguid or comb guid comes from re-ordering the index data every time an insert is performed.
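One way to observe that re-ordering cost, beyond elapsed time, is to compare index fragmentation after each kind of insert. A hedged sketch for SQL Server 2005 or later, assuming the codes table from the question lives in the current database and has at least one index on the GUID column:

```sql
-- Report fragmentation and size for every index on codes.
SELECT index_id,
       index_type_desc,
       avg_fragmentation_in_percent,
       page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('codes'), NULL, NULL, 'LIMITED');
```

Higher fragmentation and page counts after the random-GUID run than after the COMB or sequential run would be consistent with the page-split explanation.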

See my question, in which I corroborate this with some real-life data from both SQL Server and Oracle.

massimogentilini answered Sep 21 '22 23:09