-- Scroll down for an edit that I added --
So here's my scenario. I have a table that has an entry for every time someone makes a change to some data. The reason for this is that we need to be able to audit all the changes.
However, I only want to retrieve the latest record for a series of edits a user has made.
So let's say there are three users, user A, B and C.
User A makes 10 changes (10 entries in the table). User B makes 5 changes User A makes 3 more changes User C makes 2 changes
What I want to get back is: Most recent of the 2 records that C created Most recent of the 3 records that A created Most recent of the 5 records that B created Most recent of the 10 records that A created
For a total of 4 rows that I get back
Here's what I tried, but the problem is that the RowNum doesn't go back to 1 when the LastUpdatedBy changes:
WITH cte AS
(
SELECT
[LastUpdatedOn]
,[LastUpdatedBy]
,ROW_NUMBER() OVER(PARTITION BY [LastUpdatedBy] ORDER BY [LastUpdatedOn] DESC) [RowNum]
FROM [HistoricalTable]
)
SELECT
[LastUpdatedOn]
,[LastUpdatedBy]
,RowNum
FROM cte
--WHERE RowNum = 1
ORDER BY [LastUpdatedOn] DESC;
And here's the output I get (** asterisks represent the rows I want back)
LastUpdatedOn LastUpdatedBy RowNum
**2011-06-07 13:07:26.917 629 1**
2011-06-07 12:57:53.700 629 2
2011-06-07 12:57:44.387 629 3
2011-06-07 12:57:34.913 629 4
2011-06-07 12:57:25.040 629 5
2011-06-07 12:57:19.927 629 6
2011-06-07 12:55:17.460 629 7
2011-06-07 12:55:12.287 629 8
2011-06-07 12:30:34.377 629 9
**2011-06-07 11:54:05.727 4 1**
**2011-06-07 11:50:02.723 629 10** (If this number went back to 1, my query would have worked fine)
2011-06-07 11:26:43.053 629 11
2011-06-07 10:54:32.867 629 12
2011-06-07 10:46:32.107 629 13
2011-06-07 10:40:52.937 629 14
**2011-06-07 10:39:50.880 3 1**
------------------- EDIT --------------------
So I came up with a solution, but it's not very elegant and not sure if I like it, but it does the trick. This might give you a better understanding of what I'm trying to accomplish.
DECLARE @temp AS TABLE(LastUpdatedOn datetime, LastUpdatedBy int null, RowNum int);
DECLARE @newTable AS TABLE(LastUpdatedOn datetime, LastUpdatedBy int null);
DECLARE @lastUserId int = 0;
INSERT INTO @temp
SELECT
[LastUpdatedOn]
,[LastUpdatedBy]
,ROW_NUMBER() OVER(ORDER BY [LastUpdatedOn] DESC) [RowNum]
FROM [HistoricalTable]
DECLARE @totalRecords int;
SELECT @totalRecords = COUNT(*) FROM @temp;
DECLARE @counter int = 0;
WHILE @counter <= @totalRecords BEGIN
SET @counter = @counter + 1;
INSERT INTO @newTable
SELECT LastUpdatedOn, LastUpdatedBy
FROM @temp
WHERE RowNum = @counter AND (@lastUserId != LastUpdatedBy OR (LastUpdatedBy IS NULL));
SELECT @lastUserId = LastUpdatedBy FROM @temp WHERE RowNum = @counter;
END
SELECT * FROM @newTable;
And the data that is returned:
LastUpdatedOn LastUpdatedBy
2011-06-07 13:07:26.917 629
2011-06-07 11:54:05.727 4
2011-06-07 11:50:02.723 629
2011-06-07 10:39:50.880 3
In SQL Server, we can easily select the last 10 records from a table by using the “SELECT TOP” statement. The TOP clause in SQL Server is used to control the number or percentage of rows from the result. And to select the records from the last, we have to arrange the rows in descending order.
The GROUP BY Clause SQL is used to group rows with same values. The GROUP BY Clause is used together with the SQL SELECT statement. The SELECT statement used in the GROUP BY clause can only be used contain column names, aggregate functions, constants and expressions.
Let's say you want to get last record in each group, that is, for each product. First we use GROUP BY to get most recent date for each group. Now that we know the most recent date for each group, we join this result with our original table to get latest record by date group.
DISTINCT is used to filter unique records out of all records in the table. It removes the duplicate rows. SELECT DISTINCT will always be the same, or faster than a GROUP BY.
;with cte as
(
select *,
row_number() over(order by LastUpdatedOn) as rn
from HistoricalTable
)
select C1.LastUpdatedOn,
C1.LastUpdatedBy
from cte as C1
left outer join cte as C2
on C1.rn = C2.rn-1
where C1.LastUpdatedBy <> coalesce(C2.LastUpdatedBy, 0)
Creating a row number for each row order by LastUpdatedOn
and joining to the next row and comparing if LastUpdatedBy is changed.
Beware of this coalesce(C2.LastUpdatedBy, 0)
. It is to get the last row and the 0
need to be some integer value not used as LastUpdatedBy
.
Not sure if i'm missing something in your question but doesn't the following SQL answer the problem ?
declare @HistoricalTable table (LastUpdatedOn datetime, LastUpdatedBy int);
insert into @HistoricalTable (LastUpdatedOn, LastUpdatedBy) values
('2011-06-07 13:07:26.917', 629),('2011-06-07 12:57:53.700', 629),
('2011-06-07 12:57:44.387', 629),('2011-06-07 12:57:34.913', 629),
('2011-06-07 12:57:25.040', 629),('2011-06-07 12:57:19.927', 629),
('2011-06-07 12:55:17.460', 629),('2011-06-07 12:55:12.287', 629),
('2011-06-07 12:30:34.377', 629),('2011-06-07 11:54:05.727', 4),
('2011-06-07 11:50:02.723', 629),('2011-06-07 11:26:43.053', 629),
('2011-06-07 10:54:32.867', 629),('2011-06-07 10:46:32.107', 629),
('2011-06-07 10:40:52.937', 629),('2011-06-07 10:39:50.880', 3);
select
latest.*
from
(
select *, rank() over (partition by LastUpdatedBy order by LastUpdatedOn desc) as UpdateRank
from @HistoricalTable
) latest
where
latest.UpdateRank = 1
order by
latest.LastUpdatedBy;
LastUpdatedOn LastUpdatedBy UpdateRank
2011-06-07 10:39:50.880 3 1
2011-06-07 11:54:05.727 4 1
2011-06-07 13:07:26.917 629 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With