Update ordered row with last not-null value [duplicate]

Tags: sql, postgresql

Consider a table with the following data:

column_a (boolean) | column_order (integer)
TRUE               |     1
NULL               |     2
NULL               |     3
TRUE               |     4
NULL               |     5
FALSE              |     6
NULL               |     7

I would like to write a query that replaces each NULL value in column_a with the most recent non-NULL value among the preceding rows, in the order given by column_order. The result should look like:

column_a (boolean) | column_order (integer)
TRUE               |     1
TRUE               |     2
TRUE               |     3
TRUE               |     4
TRUE               |     5
FALSE              |     6
FALSE              |     7

For simplicity, we can assume that the first value is never NULL. The following works as long as NULL values never occur in runs longer than one:

SELECT
  COALESCE(column_a, lag(column_a) OVER (ORDER BY column_order))
FROM test_table
ORDER BY column_order;

However, the above does not work for an arbitrary number of consecutive NULL values. What Postgres query achieves the result above? Is there an efficient query that scales well to a large number of rows?
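
To make the failure concrete, here is a minimal sketch that runs the lag() attempt against the sample data. It assumes an in-memory SQLite database (version 3.25 or later, for window functions) as a stand-in for Postgres, with booleans stored as 1 and 0. The names `filled`, `lag_rows` and `still_null` are only for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the Postgres table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (column_a BOOLEAN, column_order INTEGER)")
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?)",
    [(True, 1), (None, 2), (None, 3), (True, 4), (None, 5), (False, 6), (None, 7)],
)

# The single-step lag() attempt from the question.
lag_rows = conn.execute(
    """
    SELECT COALESCE(column_a, lag(column_a) OVER (ORDER BY column_order)) AS filled,
           column_order
    FROM test_table
    ORDER BY column_order
    """
).fetchall()

# column_order values whose result is still NULL after the lag() fill.
still_null = [order for filled, order in lag_rows if filled is None]
print(still_null)  # prints [3]
```

Row 3 stays NULL because lag() only looks one row back, and row 2 is itself NULL.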

asked Aug 26 '15 15:08

Marco

2 Answers

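You can use a handy trick: take a running sum over a case expression that is 1 for each non-NULL row and 0 otherwise. Every non-NULL value then starts a new partition, and the NULL rows that follow it fall into the same partition. After that, first_value carries each partition's leading value forward.

For example, with the data in a table named table1: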

select
  *,
  sum(case when column_a is not null then 1 else 0 end)
    OVER (order by column_order) as partition
from table1;

 column_a | column_order | partition 
----------+--------------+-----------
 t        |            1 |         1
          |            2 |         1
          |            3 |         1
 t        |            4 |         2
          |            5 |         2
 f        |            6 |         3
          |            7 |         3
(7 rows)

Then wrapping that query in a subquery and applying first_value within each partition

select
  first_value(column_a)
    OVER (PARTITION BY partition ORDER BY column_order),
  column_order
from (
    select
      *,
      sum(case when column_a is not null then 1 else 0 end)
        OVER (order by column_order) as partition
    from table1
) partitioned;

gives you:

 first_value | column_order 
-------------+--------------
 t           |            1
 t           |            2
 t           |            3
 t           |            4
 t           |            5
 f           |            6
 f           |            7
(7 rows)
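
As a runnable sketch of the same grouping idea, the snippet below swaps in count(column_a) for the sum-over-case expression. The two produce the same running total here, since count(expr) only counts non-NULL values. It assumes SQLite 3.25 or later as a stand-in for Postgres (booleans come back as 1 and 0), and the alias `grp` is a hypothetical name chosen to avoid the keyword `partition`.

```python
import sqlite3

# In-memory SQLite database standing in for the Postgres table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (column_a BOOLEAN, column_order INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(True, 1), (None, 2), (None, 3), (True, 4), (None, 5), (False, 6), (None, 7)],
)

filled_rows = conn.execute(
    """
    SELECT first_value(column_a) OVER (PARTITION BY grp ORDER BY column_order) AS filled,
           column_order
    FROM (
        -- Running count of non-NULL values: each non-NULL row opens a new group.
        SELECT column_a,
               column_order,
               count(column_a) OVER (ORDER BY column_order) AS grp
        FROM table1
    ) AS grouped
    ORDER BY column_order
    """
).fetchall()

for filled, column_order in filled_rows:
    print(filled, column_order)
```

Each group begins with its only non-NULL row, so first_value ordered by column_order returns exactly that value for every row in the group.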

answered Oct 11 '22 14:10

Craig Ringer

Not sure if PostgreSQL supports this syntax, but give it a try. A correlated subquery looks up the latest earlier non-NULL value for each row:

SELECT
  COALESCE(column_a, (select t2.column_a from test_table t2
                      where t2.column_order < t1.column_order
                        and t2.column_a is not null
                      order by t2.column_order desc
                      fetch first 1 row only))
FROM test_table t1
ORDER BY column_order;
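
The correlated-subquery approach can be tried out with the sketch below. It assumes SQLite as a stand-in for Postgres, so it uses `LIMIT 1` in place of `fetch first 1 row only`, which SQLite does not accept. The index name `test_table_order_idx` is hypothetical; an index on column_order might help the per-row lookup on larger tables, but that depends on the planner and the data.

```python
import sqlite3

# In-memory SQLite database standing in for the Postgres table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (column_a BOOLEAN, column_order INTEGER)")
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?)",
    [(True, 1), (None, 2), (None, 3), (True, 4), (None, 5), (False, 6), (None, 7)],
)
# Hypothetical index; whether it speeds up the subquery is not verified here.
conn.execute("CREATE INDEX test_table_order_idx ON test_table (column_order)")

subquery_rows = conn.execute(
    """
    SELECT COALESCE(t1.column_a,
                    (SELECT t2.column_a
                     FROM test_table t2
                     WHERE t2.column_order < t1.column_order
                       AND t2.column_a IS NOT NULL
                     ORDER BY t2.column_order DESC
                     LIMIT 1)) AS filled,
           t1.column_order
    FROM test_table t1
    ORDER BY t1.column_order
    """
).fetchall()

for filled, column_order in subquery_rows:
    print(filled, column_order)
```

The subquery runs once per outer row, so this form is easy to read but does more work per row than a single window pass.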

answered Oct 11 '22 14:10

jarlh
