I have a time series that with null values. I want to be replace each null value with the most recent non-non value. From what I've researched, Oracle SQL can easily accomplish this using Last_value with IGNORE NULLS. Is there a similar way to accomplish this using SQL Server 2016? Otherwise I'm just going to code it using C#, but felt using SQL would be faster, cleaner, and easier.
Sec SCORE
1 Null
2 Null
3 5
4 Null
5 8
6 7
7 Null
Should be replaced with:
Sec SCORE
1 Null
2 Null
3 5
4 5
5 8
6 7
7 7
You can do this with two cumulative operations:
select t.*,
coalesce(score, max(score) over (partition by maxid)) as newscore
from (select t.*,
max(case when score is not null then id end) over (order by id) as maxid
from t
) t;
The innermost subquery gets the most recent id where there is a value. The outermost one "spreads" that value to the subsequent rows.
If you actually want to update the table, you can incorporate this easily into an update
. But, Oracle cannot do that (easily), so I'm guessing this is not necessary....
If performance is an issue, I suggest the solution from this article:
The Last non NULL Puzzle
His final solution, while dense, does perform excellently with a linear query plan without any joins. Here is an example implementation I've used which carries the last customer name through a type2 scd staging table. In this staging table, NULL represents no update, and '*** DELETED ***'
represents an explicit set to NULL. The following cleans this up to resemble an actual SCD record excellently:
WITH [SampleNumbered] AS (
SELECT *, ROW_NUMBER() OVER ( PARTITION BY [SampleId] ORDER BY [StartDatetime] ) AS [RowNumber]
FROM [dbo].[SampleDimStage]
), [SamplePrep] AS (
SELECT [SampleId]
, [StartDatetime]
, CAST([RowNumber] AS BINARY(8)) + CAST([SampleGroupId] AS VARBINARY(255)) AS [BinarySampleGroupId]
, CAST([RowNumber] AS BINARY(8)) + CAST([SampleStatusCode] AS VARBINARY(255)) AS [BinarySampleStatusCode]
FROM [SampleNumbered]
), [SampleCleanUp] AS (
SELECT [SampleId]
, [StartDatetime]
, CAST(SUBSTRING(MAX([BinarySampleGroupId]) OVER( PARTITION BY [SampleId] ORDER BY [StartDatetime] )
, 9, 255) AS VARCHAR(255)) AS [LastSampleGroupId]
, CAST(SUBSTRING(MAX([BinarySampleStatusCode]) OVER( PARTITION BY [SampleId] ORDER BY [StartDatetime] )
, 9, 255) AS VARCHAR(255)) AS [LastSampleStatusCode]
, LEAD([StartDatetime]) OVER( PARTITION BY [SampleId] ORDER BY [StartDatetime] ) AS [EndDatetime]
FROM [SamplePrep]
)
SELECT CAST([SampleId] AS NUMERIC(18)) AS [SampleId]
, CAST(NULLIF([sc].[LastSampleGroupId],'*** DELETED ***') AS NUMERIC(18)) AS [GroupId]
, CAST(NULLIF([sc].[LastSampleStatusCode],'*** DELETED ***') AS CHAR(3)) AS [SampleStatusCode]
, [StartDatetime]
, [sc].[EndDatetime]
FROM [SampleCleanUp] [sc];
If your sort key is some sort of integer, you can completely skip the first CTE and cast that directly to binary.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With