Consider a table like with the following data
column_a (boolean) | column_order (integer)
TRUE | 1
NULL | 2
NULL | 3
TRUE | 4
NULL | 5
FALSE | 6
NULL | 7
I would like to write a queries that replaces each NULL
value in column_a
with the last non-NULL
value out of the previous values of the column according to the order specified by column_order
The result should look like:
column_a (boolean) | column_order (integer)
TRUE | 1
TRUE | 2
TRUE | 3
TRUE | 4
TRUE | 5
FALSE | 6
FALSE | 7
For simplicity, we can assume that the first value is never null. The following works if there are no more than one consecutive NULL
values:
SELECT
COALESCE(column_a, lag(column_a) OVER (ORDER BY column_order))
FROM test_table
ORDER BY column_order;
However, the above does not work for an arbitrary number of consecutive NULL
values. What is a Postgres query that is able to achieve the results above? Is there an efficient query that scales well to a large number of rows?
With a subquery in the SELECT clause The first method uses a subquery inside the SELECT clause to get the first non-NULL value before the current row. First, we create a subquery that returns the first non-NULL value before the current row. The last non-NULL value carries forward but only after the starting row.
UPDATE [table] SET [column]=0 WHERE [column] IS NULL; Null Values can be replaced in SQL by using UPDATE, SET, and WHERE to search a column in a table for nulls and replace them. In the example above it replaces them with 0.
Below is the syntax to filter the rows without a null value in a specified column. Syntax: SELECT * FROM <table_name> WHERE <column_name> IS NOT NULL; Example: SELECT * FROM demo_orders WHERE ORDER_DATE IS NOT NULL; --Will output the rows consisting of non null order_date values.
To exclude the null values from the table we need to use IS NOT NULL operator with the WHERE clause.
You can use a handy trick where you sum
over a case
to create partitions based on the divisions between null and non-null series, then first_value
to bring them forward.
e.g.
select
*,
sum(case when column_a is not null then 1 else 0 end)
OVER (order by column_order) as partition
from table1;
column_a | column_order | partition
----------+--------------+-----------
t | 1 | 1
| 2 | 1
| 3 | 1
t | 4 | 2
| 5 | 2
f | 6 | 3
| 7 | 3
(7 rows)
then
select
first_value(column_a)
OVER (PARTITION BY partition ORDER BY column_order),
column_order
from (
select
*,
sum(case when column_a is not null then 1 else 0 end)
OVER (order by column_order) as partition
from table1
) partitioned;
gives you:
first_value | column_order
-------------+--------------
t | 1
t | 2
t | 3
t | 4
t | 5
f | 6
f | 7
(7 rows)
Not sure if Postgresql supports this, but give it a try:
SELECT
COALESCE(column_a, (select t2.column_a from test_table t2
where t2.column_order < t1.column_order
and t2.column_a is not null
order by t2.column_order desc
fetch first 1 row only))
FROM test_table t1
ORDER BY column_order;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With