deleting duplicate row with no unique identifier

Question

I have some data in a table that looks roughly like the following:

table stockData
(
tickId int not null,
timestamp datetime not null,
price decimal(18,5) not null
)

Neither tickId nor timestamp are unique, however the combination of tickId and timestamp is supposed to be unique.

I have some duplicate data in my table, and I'm attempting to remove it. However, I'm coming to the conclusion that there is not enough information with the given data for me to discern one row from the other, and basically no way for me to delete just one of the duplicate rows. My guess is that I will need to introduce some sort of identity column, which would help me to identify one row from the other.

Is this correct, or is there some magic way of deleting one but not both of the duplicate data with a query?

EDIT Edited to clarify that tickId and timestamp combo should be unique, but it's not because of the duplicate data.

Sergey Rybalkin · Accepted Answer

Here is a query that will remove duplicates and leave exactly one copy of each unique row. It will work with SQL Server 2005 or higher:

WITH Dups AS
(
  SELECT tickId, timestamp, price,
    ROW_NUMBER() OVER(PARTITION BY tickid, timestamp ORDER BY (SELECT 0)) AS rn
  FROM stockData
)
DELETE FROM Dups WHERE rn > 1

deleting duplicate row with no unique identifier

Tags:

sql

sql-server

steve8918

1 Answers

Sergey Rybalkin

Recent Activity

Donate For Us

deleting duplicate row with no unique identifier

Tags:

sql

sql-server

steve8918

1 Answers

Sergey Rybalkin

Related questions

Recent Activity

Donate For Us