Here's a question I've been racking my brain over. Let's say I have a table that has a series of timestamps and a part number as the primary key. The table stores incremental changes, meaning that for every timestamp, if a field changes, that change is recorded. If the field doesn't change, then for the new timestamp it is NULL. Here's the basic idea.
 part | timestamp | x_pos | y_pos | status
------+-----------+-------+-------+--------
 a5   |       151 |     5 |    15 | g
 a5   |       153 |  NULL |    17 | NULL
(part, timestamp) is the primary key. The NULLs in the second record indicate values that are unchanged since the first record.
What I want to be able to do is select the most recent values for each field grouped by the part. For example, given the above entries, the results will be 153,5,17,g for part a5.
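To pin down the semantics, here is a minimal Python sketch of the result I'm after, independent of any SQL. The in-memory `rows` list and the `latest_values` helper are hypothetical names used only for illustration. A NULL (here `None`) means "unchanged", so it never overwrites an earlier value.

```python
# Hypothetical in-memory copy of the example rows:
# (part, timestamp, x_pos, y_pos, status)
rows = [
    ("a5", 151, 5, 15, "g"),
    ("a5", 153, None, 17, None),
]

def latest_values(rows):
    """Return {part: [timestamp, x_pos, y_pos, status]} with NULLs carried forward."""
    state = {}
    # Process changes oldest-first so later non-NULL values overwrite earlier ones.
    for part, ts, *fields in sorted(rows, key=lambda r: (r[0], r[1])):
        current = state.setdefault(part, [ts, None, None, None])
        current[0] = ts  # the timestamp always advances to the newest row
        for i, value in enumerate(fields, start=1):
            if value is not None:  # None (NULL) means "unchanged"
                current[i] = value
    return state

result = latest_values(rows)
print(result)  # {'a5': [153, 5, 17, 'g']}
```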
As of now, I have this hacked-together query:
((SELECT x_pos FROM part_changes WHERE x_pos IS NOT NULL
ORDER BY timestamp DESC
LIMIT 1)
UNION
(SELECT y_pos FROM part_changes WHERE y_pos IS NOT NULL
ORDER BY timestamp DESC
LIMIT 1)
UNION
(SELECT status FROM part_changes WHERE status IS NOT NULL
ORDER BY timestamp DESC
LIMIT 1))
But this returns a single column, meaning that I can't use a GROUP BY to organize the results by part.
There's got to be a more elegant way of doing this, such as using COALESCE or IS NULL in a creative way. But I'm stuck and can't figure it out. Anybody got an idea?
And no, I can't change the database structure.
EDIT: ruakh has the right idea. The only problem now is grouping by part. I can't seem to get around the LIMIT 1 when grouping by multiple parts. Any ideas?
mdahlman, I'm not too familiar with analytic functions in PostgreSQL. So, if that solution would be easier than a complex query, then by all means post your idea.
EDIT 2: Thank you all for the help. I think I've got a good enough grasp of what I need to do.
Rather than using a UNION, it sounds like you really want subqueries in the field list. That is, instead of (SELECT ...) UNION (SELECT ...) UNION (SELECT ...), you want SELECT (SELECT ...), (SELECT ...), (SELECT ...).
For example:
SELECT part,
       ( SELECT x_pos
           FROM part_changes
          WHERE part = pc.part
            AND x_pos IS NOT NULL
          ORDER BY timestamp DESC
          LIMIT 1
       ) AS x_pos,
       ( SELECT y_pos
           FROM part_changes
          WHERE part = pc.part
            AND y_pos IS NOT NULL
          ORDER BY timestamp DESC
          LIMIT 1
       ) AS y_pos,
       ( SELECT status
           FROM part_changes
          WHERE part = pc.part
            AND status IS NOT NULL
          ORDER BY timestamp DESC
          LIMIT 1
       ) AS status
  FROM ( SELECT DISTINCT part
           FROM part_changes
       ) AS pc
;
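As a quick sanity check, the correlated-subquery approach can be tried against an in-memory SQLite database from Python. This is only a sketch under assumptions: SQLite stands in for PostgreSQL, the `b7` rows are made-up test data that is not in the question, and the trailing ORDER BY part is added just to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE part_changes (
        part TEXT, timestamp INTEGER, x_pos INTEGER, y_pos INTEGER, status TEXT,
        PRIMARY KEY (part, timestamp)
    )
""")
conn.executemany(
    "INSERT INTO part_changes VALUES (?, ?, ?, ?, ?)",
    [
        ("a5", 151, 5, 15, "g"),        # rows from the question
        ("a5", 153, None, 17, None),
        ("b7", 150, 1, 2, "r"),         # hypothetical second part
        ("b7", 152, 3, None, None),
    ],
)

# One correlated subquery per field, each picking the newest non-NULL value.
QUERY = """
SELECT part,
       (SELECT x_pos FROM part_changes
         WHERE part = pc.part AND x_pos IS NOT NULL
         ORDER BY timestamp DESC LIMIT 1) AS x_pos,
       (SELECT y_pos FROM part_changes
         WHERE part = pc.part AND y_pos IS NOT NULL
         ORDER BY timestamp DESC LIMIT 1) AS y_pos,
       (SELECT status FROM part_changes
         WHERE part = pc.part AND status IS NOT NULL
         ORDER BY timestamp DESC LIMIT 1) AS status
  FROM (SELECT DISTINCT part FROM part_changes) AS pc
 ORDER BY part
"""
latest = conn.execute(QUERY).fetchall()
print(latest)  # [('a5', 5, 17, 'g'), ('b7', 3, 2, 'r')]
```

Each part yields exactly one row because the outer FROM clause iterates over distinct parts, while LIMIT 1 applies only inside each per-field subquery.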
But at this point I would really consider writing a stored procedure.
Alternatively:
SELECT DISTINCT
       part,
       FIRST_VALUE(x_pos) OVER
         ( PARTITION BY part
               ORDER BY CASE WHEN x_pos IS NULL
                             THEN NULL
                             ELSE timestamp
                         END DESC NULLS LAST
         ) AS x_pos,
       FIRST_VALUE(y_pos) OVER
         ( PARTITION BY part
               ORDER BY CASE WHEN y_pos IS NULL
                             THEN NULL
                             ELSE timestamp
                         END DESC NULLS LAST
         ) AS y_pos,
       FIRST_VALUE(status) OVER
         ( PARTITION BY part
               ORDER BY CASE WHEN status IS NULL
                             THEN NULL
                             ELSE timestamp
                         END DESC NULLS LAST
         ) AS status
  FROM part_changes
;
For a single part, this should give you an answer (thanks to ruakh), although I don't like this version:
SELECT
  (SELECT timestamp FROM part_changes WHERE part = $part
    ORDER BY timestamp DESC
    LIMIT 1) AS timestamp,
  (SELECT x_pos FROM part_changes WHERE part = $part AND x_pos IS NOT NULL
    ORDER BY timestamp DESC
    LIMIT 1) AS x_pos,
  (SELECT y_pos FROM part_changes WHERE part = $part AND y_pos IS NOT NULL
    ORDER BY timestamp DESC
    LIMIT 1) AS y_pos,
  (SELECT status FROM part_changes WHERE part = $part AND status IS NOT NULL
    ORDER BY timestamp DESC
    LIMIT 1) AS status;