Why does an insert that groups by the primary key throw a primary key constraint violation error?

I have an insert statement that's throwing a primary key error but I don't see how I could possibly be inserting duplicate key values.

First I create a temp table with a primary key.

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; -- Note: I've tried COMMITTED and UNCOMMITTED; neither materially affects the behavior. See the screenshots below for proof.

IF (OBJECT_ID('TEMPDB..#P')) IS NOT NULL DROP TABLE #P;

CREATE TABLE #P(idIsbn INT NOT NULL PRIMARY KEY, price SMALLMONEY, priceChangedDate DATETIME);

Then I pull prices from the Price table, grouping by idIsbn, which is the primary key of the temp table.

INSERT  INTO #P(idIsbn, price, priceChangedDate)
SELECT  idIsbn ,
        MIN(lowestPrice) ,
        MIN(priceChangedDate)
FROM Price p
WHERE p.idMarketplace = 3100
GROUP BY p.idIsbn

I understand that grouping by idIsbn by definition makes it unique. The idIsbn column in the Price table is declared [idIsbn] [int] NOT NULL.

But every once in a while when I run this query I get this error:

Violation of PRIMARY KEY constraint 'PK__#P________AED35F8119E85FC5'. Cannot insert duplicate key in object 'dbo.#P'. The duplicate key value is (1447858).

NOTE: I've gotten a lot of questions about timing. I'll select this statement, press F5, and no error will occur. Then I'll do it again and it will fail; then I'll run it again and again and it will succeed a couple of times before failing again. What I'm saying is that I can find no pattern for when it will succeed and when it won't.

How can I be inserting duplicate rows if (A) I just created the table brand new before inserting into it and (B) I'm grouping by the column designed to be the primary key?

For now, I'm working around the problem with IGNORE_DUP_KEY = ON, but I'd really like to know the root cause.
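
For reference, here is roughly what that workaround looks like (a sketch of the DDL, not my exact script): with IGNORE_DUP_KEY = ON, SQL Server discards a duplicate row with the warning "Duplicate key was ignored." instead of failing the whole statement.

-- Workaround sketch: duplicate keys are discarded with a warning
-- rather than aborting the INSERT.
CREATE TABLE #P
(
    idIsbn           INT NOT NULL PRIMARY KEY WITH (IGNORE_DUP_KEY = ON),
    price            SMALLMONEY,
    priceChangedDate DATETIME
);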

Here is what I'm actually seeing in my SSMS window. There is nothing more and nothing less:

[screenshot]

@@Version is:

Microsoft SQL Server 2008 (SP3) - 10.0.5538.0 (X64) 
    Apr  3 2015 14:50:02 
    Copyright (c) 1988-2008 Microsoft Corporation
    Standard Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)

Execution plan: [screenshot]

Here is an example of what it looks like when it runs fine. Here I'm using READ COMMITTED, but it doesn't matter, because I get the error whether I read committed or uncommitted.

[screenshot]

Here is another example of it failing, this time with READ COMMITTED.

[screenshot]

Also:

  • I get the same error whether I'm populating a temp table or a persistent table.
  • When I add OPTION (MAXDOP 1) to the end of the insert (shown below), it seems to fail every time, though I can't be exhaustively sure of that because I can't run it forever. But it seems to be the case.
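
For clarity, the statement with the hint appended looks like this (a sketch; nothing else about the query changes):

INSERT INTO #P (idIsbn, price, priceChangedDate)
SELECT idIsbn,
       MIN(lowestPrice),
       MIN(priceChangedDate)
FROM   Price p
WHERE  p.idMarketplace = 3100
GROUP  BY p.idIsbn
OPTION (MAXDOP 1); -- forces a serial plan; the failure seems much more reproducible this way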

Here is the definition of the Price table. The table has 25M rows and has had 108,529 updates in the last hour.

CREATE TABLE [dbo].[Price](
    [idPrice] [int] IDENTITY(1,1) NOT NULL,
    [idIsbn] [int] NOT NULL,
    [idMarketplace] [int] NOT NULL,
    [lowestPrice] [smallmoney] NULL,
    [offers] [smallint] NULL,
    [priceDate] [smalldatetime] NOT NULL,
    [priceChangedDate] [smalldatetime] NULL,
 CONSTRAINT [pk_Price] PRIMARY KEY CLUSTERED 
(
    [idPrice] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
 CONSTRAINT [uc_idIsbn_idMarketplace] UNIQUE NONCLUSTERED 
(
    [idIsbn] ASC,
    [idMarketplace] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

And the two non-clustered indexes:

CREATE NONCLUSTERED INDEX [IX_Price_idMarketplace_INC_idIsbn_lowestPrice_priceDate] ON [dbo].[Price]
(
    [idMarketplace] ASC
)
INCLUDE (   [idIsbn],
    [lowestPrice],
    [priceDate]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_Price_idMarketplace_priceChangedDate_INC_idIsbn_lowestPrice] ON [dbo].[Price]
(
    [idMarketplace] ASC,
    [priceChangedDate] ASC
)
INCLUDE (   [idIsbn],
    [lowestPrice]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
Asked Mar 30 '16 by Trevor



1 Answer

You hadn't supplied your table structure when this answer was first written.

Below is a repro, with some assumed details, that causes the problem at READ COMMITTED. (NB: now that you have supplied the definition, I can see that in your case updates to the priceChangedDate column will move rows around in the IX_Price_idMarketplace_priceChangedDate_INC_idIsbn_lowestPrice index, if that's the index being seeked.)

Connection 1 (Set up tables)

USE tempdb;

CREATE TABLE Price
  (
     SomeKey          INT PRIMARY KEY CLUSTERED,
     idIsbn           INT IDENTITY UNIQUE,
     idMarketplace    INT DEFAULT 3100,
     lowestPrice      SMALLMONEY DEFAULT $1.23,
     priceChangedDate DATETIME DEFAULT GETDATE()
  );

CREATE NONCLUSTERED INDEX ix
  ON Price(idMarketplace)
  INCLUDE (idIsbn, lowestPrice, priceChangedDate);

INSERT INTO Price
            (SomeKey)
SELECT number
FROM   master..spt_values
WHERE  number BETWEEN 1 AND 2000
       AND type = 'P'; 

Connection 2

Concurrent data modifications that repeatedly move a row from the beginning of the seeked range (3100, 1) to the end (3100, 2001) and back again.

USE tempdb;

WHILE 1 = 1
BEGIN
    -- Repeatedly move the row from one end of the seeked range to the other.
    UPDATE Price SET SomeKey = 2001 WHERE SomeKey = 1;
    UPDATE Price SET SomeKey = 1 WHERE SomeKey = 2001;
END

Connection 3 (Do the insert into a temp table with a unique constraint)

USE tempdb;

CREATE TABLE #P
  (
     idIsbn           INT NOT NULL PRIMARY KEY,
     price            SMALLMONEY,
     priceChangedDate DATETIME
  );

WHILE 1 = 1
  BEGIN
      TRUNCATE TABLE #P

      INSERT INTO #P
                  (idIsbn,
                   price,
                   priceChangedDate)
      SELECT idIsbn,
             MIN(lowestPrice),
             MIN(priceChangedDate)
      FROM   Price p
      WHERE  p.idMarketplace = 3100
      GROUP  BY p.idIsbn
  END 

[screenshot]

The plan has no aggregate, because there is a unique constraint on idIsbn (a unique constraint on idIsbn, idMarketplace would also work), so the GROUP BY can be optimised out: the optimizer knows there can be no duplicate values.
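
To see the constraint's role for yourself, here is a small side experiment (my sketch, not required for the repro): drop the UNIQUE constraint on idIsbn in the repro table and re-run the insert. The optimizer can no longer prove each group is a single row, so a real aggregate reappears in the plan, and MIN() would then collapse a doubly-read row instead of producing a duplicate key.

-- Side experiment (sketch): the unique constraint on idIsbn was auto-named,
-- so look its name up first.
SELECT name
FROM   sys.key_constraints
WHERE  parent_object_id = OBJECT_ID('Price')
       AND type = 'UQ';

-- Substitute the real name; 'UQ__Price__xxxx' is a placeholder.
-- ALTER TABLE Price DROP CONSTRAINT [UQ__Price__xxxx];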

But at READ COMMITTED isolation level, shared row locks are released as soon as the row has been read, so it is possible for a row to change position and be read a second time by the same seek or scan.

The index ix doesn't explicitly include SomeKey as a secondary key column, but because the index is not declared unique, SQL Server silently appends the clustering key to it behind the scenes; hence updating that column's value can move rows around within the index.
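
As for addressing the root cause (my suggestion, beyond the original repro): since the double read only happens because locks are released mid-statement, reading under SNAPSHOT isolation gives the statement a consistent, versioned view, so a moving row cannot be seen twice. A sketch, where YourDb is a placeholder for the real database name:

-- One-time setup; assumes row versioning is acceptable for this database.
ALTER DATABASE YourDb SET ALLOW_SNAPSHOT_ISOLATION ON;

-- In the loading session:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

BEGIN TRANSACTION;

INSERT INTO #P (idIsbn, price, priceChangedDate)
SELECT idIsbn,
       MIN(lowestPrice),
       MIN(priceChangedDate)
FROM   Price p
WHERE  p.idMarketplace = 3100
GROUP  BY p.idIsbn;

COMMIT;

-- Alternatively, SERIALIZABLE (or a TABLOCK hint on Price) blocks the
-- concurrent updates for the duration of the read instead of versioning them.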

Answered Sep 17 '22 by Martin Smith