
Database: Select last non-null entries

Here's a question I've been racking my brain over. Let's say I have a table that has a series of timestamps and a part number as the primary key. The table stores incremental changes, meaning that for every timestamp, if a field changes, that change is recorded. If the field doesn't change, then for the new timestamp it is NULL. Here's the basic idea.

 part | timestamp | x-pos | y-pos | status
------+-----------+-------+-------+--------
 a5   |       151 |     5 |    15 |      g
 a5   |       153 |  NULL |    17 |   NULL

(part, timestamp) is the primary key. The NULLs in the second record indicate values that are unchanged since the first record.

What I want to be able to do is select the most recent values for each field grouped by the part. For example, given the above entries, the results will be 153,5,17,g for part a5.
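To pin down the intended semantics independently of SQL, here is a minimal Python sketch. It is not part of the original query: the function name `latest_non_null` and the tuple layout `(part, timestamp, x_pos, y_pos, status)` are assumptions made for illustration. It walks each part's rows in timestamp order and keeps the last non-NULL value seen for each field.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical row layout: (part, timestamp, x_pos, y_pos, status)
Row = Tuple[str, int, Optional[int], Optional[int], Optional[str]]


def latest_non_null(rows: Iterable[Row]) -> Dict[str, dict]:
    """Fold incremental-change rows into the latest known value per field, per part."""
    state: Dict[str, dict] = {}
    # Process rows oldest-first so later non-NULL values overwrite earlier ones.
    for part, ts, x_pos, y_pos, status in sorted(rows, key=lambda r: (r[0], r[1])):
        current = state.setdefault(
            part, {"timestamp": None, "x_pos": None, "y_pos": None, "status": None}
        )
        current["timestamp"] = ts  # the timestamp is never NULL (part of the key)
        for name, value in (("x_pos", x_pos), ("y_pos", y_pos), ("status", status)):
            if value is not None:  # None plays the role of SQL NULL: "unchanged"
                current[name] = value
    return state


rows = [
    ("a5", 151, 5, 15, "g"),
    ("a5", 153, None, 17, None),
]
result = latest_non_null(rows)
print(result["a5"])
# {'timestamp': 153, 'x_pos': 5, 'y_pos': 17, 'status': 'g'}
```

Any SQL answer should reproduce this result for the two sample rows.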

As of now, I have this hacked-together query:

    ((SELECT x-pos FROM part_changes WHERE x-pos IS NOT NULL
    ORDER BY timestamp DESC
    LIMIT 1)

    UNION

    (SELECT y-pos FROM part_changes WHERE y-pos IS NOT NULL
    ORDER BY timestamp DESC
    LIMIT 1)

    UNION

    (SELECT status FROM part_changes WHERE status IS NOT NULL
    ORDER BY timestamp DESC
    LIMIT 1))

But this returns a single column, meaning that I can't use a GROUP BY to organize the results by part.

There's got to be a more elegant way of doing this, such as using COALESCE or IS NULL in a creative way. But I'm stuck and can't figure it out. Anybody got an idea?

And no, I can't change the database structure.

EDIT: ruakh has the right idea. The only problem now is grouping by part. I can't seem to get around the LIMIT 1 for grouping by multiple parts. Any ideas?

mdahlman, I'm not too familiar with analytic functions in PostgreSQL. If that solution would be easier than a complex query, then by all means post your idea.

EDIT 2: Thank you all for the help. I think I've got a good enough grasp of what I need to do.

Bat Masterson asked Jan 23 '12 19:01


2 Answers

Rather than using a UNION, it sounds like you really want subqueries in the field list. That is, instead of (SELECT ...) UNION (SELECT ...) UNION (SELECT ...), you want SELECT (SELECT ...), (SELECT ...), (SELECT ...).


For example:

SELECT part,
       ( SELECT x_pos
           FROM part_changes
          WHERE part = pc.part
            AND x_pos IS NOT NULL
          ORDER
             BY timestamp DESC
          LIMIT 1
       ) AS x_pos,
       ( SELECT y_pos
           FROM part_changes
          WHERE part = pc.part
            AND y_pos IS NOT NULL
          ORDER
             BY timestamp DESC
          LIMIT 1
       ) AS y_pos,
       ( SELECT status
           FROM part_changes
          WHERE part = pc.part
            AND status IS NOT NULL
          ORDER
             BY timestamp DESC
          LIMIT 1
       ) AS status
  FROM ( SELECT DISTINCT
                part
           FROM part_changes
       ) AS pc
;

But at this point I would really consider writing a stored procedure.
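As a quick sanity check of the correlated-subquery approach, the sketch below runs it against an in-memory SQLite database from Python. Several things here are assumptions rather than part of the answer: SQLite stands in for PostgreSQL, the column types are guessed from the sample data, the `b7` rows are invented to show that several parts are handled at once, and an `ORDER BY part` is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE part_changes (
        part      TEXT    NOT NULL,
        timestamp INTEGER NOT NULL,
        x_pos     INTEGER,
        y_pos     INTEGER,
        status    TEXT,
        PRIMARY KEY (part, timestamp)
    )
    """
)
conn.executemany(
    "INSERT INTO part_changes VALUES (?, ?, ?, ?, ?)",
    [
        ("a5", 151, 5, 15, "g"),       # rows from the question
        ("a5", 153, None, 17, None),
        ("b7", 150, 1, 2, "r"),        # hypothetical second part
        ("b7", 152, 9, None, None),
    ],
)

# One scalar subquery per field, correlated on the outer part.
QUERY = """
SELECT part,
       (SELECT x_pos FROM part_changes
         WHERE part = pc.part AND x_pos IS NOT NULL
         ORDER BY timestamp DESC LIMIT 1) AS x_pos,
       (SELECT y_pos FROM part_changes
         WHERE part = pc.part AND y_pos IS NOT NULL
         ORDER BY timestamp DESC LIMIT 1) AS y_pos,
       (SELECT status FROM part_changes
         WHERE part = pc.part AND status IS NOT NULL
         ORDER BY timestamp DESC LIMIT 1) AS status
  FROM (SELECT DISTINCT part FROM part_changes) AS pc
 ORDER BY part
"""

latest = conn.execute(QUERY).fetchall()
print(latest)
# [('a5', 5, 17, 'g'), ('b7', 9, 2, 'r')]
```

Because each `LIMIT 1` lives inside a subquery that is re-evaluated per outer row, it no longer gets in the way of returning one row per part.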


Alternatively:

SELECT DISTINCT
       part,
       FIRST_VALUE(x_pos) OVER
         ( PARTITION BY part
               ORDER BY CASE WHEN x_pos IS NULL
                             THEN NULL
                             ELSE timestamp
                         END DESC NULLS LAST
         ) AS x_pos,
       FIRST_VALUE(y_pos) OVER
         ( PARTITION BY part
               ORDER BY CASE WHEN y_pos IS NULL
                             THEN NULL
                             ELSE timestamp
                         END DESC NULLS LAST
         ) AS y_pos,
       FIRST_VALUE(status) OVER
         ( PARTITION BY part
               ORDER BY CASE WHEN status IS NULL
                             THEN NULL
                             ELSE timestamp
                         END DESC NULLS LAST
         ) AS status
  FROM part_changes
;
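The window-function variant can be exercised the same way. This is only a sketch under assumptions: it uses Python's built-in `sqlite3` module instead of PostgreSQL (window functions need SQLite 3.25 or newer, `NULLS LAST` needs 3.30 or newer), the helper `latest_expr` is a hypothetical convenience for repeating the expression, and the `b7` rows are invented. The `CASE` expression turns the sort key into NULL wherever the field itself is NULL, so `DESC NULLS LAST` puts the newest row that actually has a value first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part_changes ("
    " part TEXT NOT NULL, timestamp INTEGER NOT NULL,"
    " x_pos INTEGER, y_pos INTEGER, status TEXT,"
    " PRIMARY KEY (part, timestamp))"
)
conn.executemany(
    "INSERT INTO part_changes VALUES (?, ?, ?, ?, ?)",
    [
        ("a5", 151, 5, 15, "g"),       # rows from the question
        ("a5", 153, None, 17, None),
        ("b7", 150, 1, 2, "r"),        # hypothetical second part
        ("b7", 152, 9, None, None),
    ],
)


def latest_expr(column: str) -> str:
    """Build the FIRST_VALUE expression for one column.

    The column name is interpolated into the SQL text, so only pass
    trusted, hard-coded names here.
    """
    return (
        f"FIRST_VALUE({column}) OVER ("
        f" PARTITION BY part"
        f" ORDER BY CASE WHEN {column} IS NULL THEN NULL ELSE timestamp END"
        f" DESC NULLS LAST"
        f") AS {column}"
    )


columns = ("x_pos", "y_pos", "status")
query = (
    "SELECT DISTINCT part, "
    + ", ".join(latest_expr(c) for c in columns)
    + " FROM part_changes ORDER BY part"  # ORDER BY added for stable output
)

windowed = conn.execute(query).fetchall()
print(windowed)
# [('a5', 5, 17, 'g'), ('b7', 9, 2, 'r')]
```

Every row of a part yields the same window results, which is why `SELECT DISTINCT` is enough to collapse them to one row per part.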
ruakh answered Oct 29 '22 09:10


For a single part, this should give you the answer (thanks to ruakh).

But I don't like this version:

-- $part is a placeholder for the part number; column names follow ruakh's answer.
SELECT
    (SELECT timestamp FROM part_changes WHERE part = $part
    ORDER BY timestamp DESC
    LIMIT 1) as timestamp,

    (SELECT x_pos FROM part_changes WHERE part = $part and x_pos IS NOT NULL
    ORDER BY timestamp DESC
    LIMIT 1) as xpos,

    (SELECT y_pos FROM part_changes WHERE part = $part and y_pos IS NOT NULL
    ORDER BY timestamp DESC
    LIMIT 1) as ypos,

    (SELECT status FROM part_changes WHERE part = $part and status IS NOT NULL
    ORDER BY timestamp DESC
    LIMIT 1) as status
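If the part number comes from application code, it may be safer to bind it as a parameter than to interpolate `$part` into the SQL text. The sketch below shows one way to do that with Python's `sqlite3` module. SQLite as a stand-in for PostgreSQL, the guessed column types, and the helper name `latest_for_part` are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part_changes ("
    " part TEXT NOT NULL, timestamp INTEGER NOT NULL,"
    " x_pos INTEGER, y_pos INTEGER, status TEXT,"
    " PRIMARY KEY (part, timestamp))"
)
conn.executemany(
    "INSERT INTO part_changes VALUES (?, ?, ?, ?, ?)",
    [("a5", 151, 5, 15, "g"), ("a5", 153, None, 17, None)],  # rows from the question
)

SINGLE_PART_QUERY = """
SELECT
    (SELECT timestamp FROM part_changes WHERE part = :part
      ORDER BY timestamp DESC LIMIT 1) AS timestamp,
    (SELECT x_pos FROM part_changes WHERE part = :part AND x_pos IS NOT NULL
      ORDER BY timestamp DESC LIMIT 1) AS xpos,
    (SELECT y_pos FROM part_changes WHERE part = :part AND y_pos IS NOT NULL
      ORDER BY timestamp DESC LIMIT 1) AS ypos,
    (SELECT status FROM part_changes WHERE part = :part AND status IS NOT NULL
      ORDER BY timestamp DESC LIMIT 1) AS status
"""


def latest_for_part(connection: sqlite3.Connection, part: str) -> tuple:
    """Return (timestamp, x_pos, y_pos, status) for one part.

    The part number is bound as the named parameter :part, so it is never
    spliced into the SQL string.
    """
    return connection.execute(SINGLE_PART_QUERY, {"part": part}).fetchone()


row = latest_for_part(conn, "a5")
print(row)
# (153, 5, 17, 'g')
```

For a part with no rows, each scalar subquery evaluates to NULL, so the function returns a tuple of four `None` values rather than no row at all.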
rauschen answered Oct 29 '22 09:10