Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Last_value with IGNORE NULLS in SQL Server

Tags:

c#

sql

sql-server

I have a time series that with null values. I want to be replace each null value with the most recent non-non value. From what I've researched, Oracle SQL can easily accomplish this using Last_value with IGNORE NULLS. Is there a similar way to accomplish this using SQL Server 2016? Otherwise I'm just going to code it using C#, but felt using SQL would be faster, cleaner, and easier.

Sec SCORE
1   Null
2   Null
3   5
4   Null
5   8
6   7
7   Null

Should be replaced with:

Sec SCORE
1   Null
2   Null
3   5
4   5
5   8
6   7
7   7
like image 410
Trevor D Avatar asked Jul 03 '17 21:07

Trevor D


2 Answers

You can do this with two cumulative operations:

select t.*,
       coalesce(score, max(score) over (partition by maxid)) as newscore
from (select t.*,
             max(case when score is not null then id end) over (order by id) as maxid
      from t
     ) t;

The innermost subquery gets the most recent id where there is a value. The outermost one "spreads" that value to the subsequent rows.

If you actually want to update the table, you can incorporate this easily into an update. But, Oracle cannot do that (easily), so I'm guessing this is not necessary....

like image 57
Gordon Linoff Avatar answered Oct 17 '22 16:10

Gordon Linoff


If performance is an issue, I suggest the solution from this article:

The Last non NULL Puzzle

His final solution, while dense, does perform excellently with a linear query plan without any joins. Here is an example implementation I've used which carries the last customer name through a type2 scd staging table. In this staging table, NULL represents no update, and '*** DELETED ***' represents an explicit set to NULL. The following cleans this up to resemble an actual SCD record excellently:

WITH [SampleNumbered] AS (
    SELECT  *, ROW_NUMBER() OVER ( PARTITION BY [SampleId] ORDER BY [StartDatetime] ) AS [RowNumber]
    FROM [dbo].[SampleDimStage]
), [SamplePrep] AS (
    SELECT  [SampleId]
        ,   [StartDatetime]
        ,   CAST([RowNumber] AS BINARY(8)) + CAST([SampleGroupId] AS VARBINARY(255)) AS [BinarySampleGroupId]
        ,   CAST([RowNumber] AS BINARY(8)) + CAST([SampleStatusCode] AS VARBINARY(255)) AS [BinarySampleStatusCode]
    FROM [SampleNumbered]
), [SampleCleanUp] AS (
    SELECT  [SampleId]
        ,   [StartDatetime]
        ,   CAST(SUBSTRING(MAX([BinarySampleGroupId]) OVER( PARTITION BY [SampleId] ORDER BY [StartDatetime] )
                , 9, 255) AS VARCHAR(255)) AS [LastSampleGroupId]
        ,   CAST(SUBSTRING(MAX([BinarySampleStatusCode]) OVER( PARTITION BY [SampleId] ORDER BY [StartDatetime] )
                , 9, 255) AS VARCHAR(255)) AS [LastSampleStatusCode]
        ,   LEAD([StartDatetime]) OVER( PARTITION BY [SampleId] ORDER BY [StartDatetime] ) AS [EndDatetime]
    FROM [SamplePrep]
)
SELECT  CAST([SampleId] AS NUMERIC(18)) AS [SampleId]
    ,   CAST(NULLIF([sc].[LastSampleGroupId],'*** DELETED ***') AS NUMERIC(18)) AS [GroupId]
    ,   CAST(NULLIF([sc].[LastSampleStatusCode],'*** DELETED ***') AS CHAR(3)) AS [SampleStatusCode]
    ,   [StartDatetime]
    ,   [sc].[EndDatetime]
FROM [SampleCleanUp] [sc];

If your sort key is some sort of integer, you can completely skip the first CTE and cast that directly to binary.

like image 3
Apneal Avatar answered Oct 17 '22 15:10

Apneal