Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Server Query Returning Multiple Rows

I am currently working on a SSIS package that is extracting a table from one database to another. The tables in both of the databases use the same column as the primary key. My select statement to extract the data is a simple select statement. When I ran the package I was receiving a error that there where duplicate primary key values.

I reviewed my select statement and verified that my select statement was not returning duplicate rows. So to test this I removed the primary key from the table that I am inserting the data into and reran the SSIS package. After it ran I looked at the table to see what rows where being duplicated. What I found was that rows that where being edited while the extract was running where being duplicated, there was a record of it before the edit, and a record of it after the edit. I could easily tell this because the table has a last modified field that gets updated every time a record is updated.

I added a NOLOCK hint to my select statement, and it stopped returning duplicate rows.

So my question is why? I would have expected that a select statement with a NOLOCK table hint would have a higher chance of returning duplicate rows because it is not using locking, and that a select statement without the NOLOCK hint should use locking to make sure that it does not return duplicate rows.

Here is the select statement that I am using to select the data. I did verify that the joins are not causing it to duplicate rows:

SELECT pe.enc_id,
       pe.enc_nbr,
       pe.billable_ind,
       pe.clinical_ind AS clinical_ind,
       pe.budget_ind,
       pe.print_stmt_ind,
       pe.send_coll_letter_ind,
       pe.outsource_exempt_ind,
       cb.First_name + ' ' + cb.last_name AS CreatedBy,
       pe.create_timestamp AS create_timestamp,
       mb.first_name + ' ' + mb.last_name AS ModifiedBy,
       pe.modify_timestamp AS modify_timestamp
FROM   patient_encounter pe WITH(NOLOCK) 
       LEFT OUTER JOIN user_mstr cb WITH(NOLOCK) ON
           pe.created_by = cb.user_id
       LEFT OUTER JOIN user_mstr mb WITH(NOLOCK) ON
           pe.modified_by = mb.user_id
like image 469
Eric Maibach Avatar asked Jun 10 '26 03:06

Eric Maibach


1 Answers

NOLOCK hint causes dirty read anomallies, and one such anomaly is a duplicate read. Such reads are frequent if an update changes the position of the row in the index scanned by the query:

  • say you have 2 rows in the table, with an ID key, rows with key values 1 and 2
  • one request (T1) runs UPDATE table SET key=3 WHERE key=1;
  • second request (T2) runs SELECT ... FROM table WITH(NOLOCK);
  • T1 locks the row with key value 1
  • T2 ignores the lock T1 has and reads the row with key value 1
  • T2 continue and reads row with key value 2
  • T1 update the row, and the row is moved in the index int he new position for key value 3
  • T2 continues to scan and reads the row with key value 3

So the SELECT has read a row twice, once while it had the key value 1 and once while it had a key value 3. This is just a trivial example of what can happen. In reality more complicated queries can run complex plans and use other indexes, all subject to such anomalies.

In short: NOLOCK hint is evil. If you want to avoid contention, use snapshot isolation.

like image 172
Remus Rusanu Avatar answered Jun 12 '26 23:06

Remus Rusanu



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!