I have a table that stores processes.
Every process is composed of items, and every item has values that are collected during the process.
A process is executed by a client.
Here is the sample database scheme with dummy data: http://sqlfiddle.com/#!15/36af4
There is some info that I need to extract from these tables, per item per life: the total number of processes, the timestamps of the oldest and newest process, and the maximum and minimum item value.
I also need to find the process ID and client ID of the newest and oldest process of an item per life.
Please note that, for a specific item in a specific life, the process ID of the oldest process might not be the minimum process ID of that item in that life.
I need all this info for every life of every item. An item can have processes on different clients, so I cannot group by client because that could duplicate the item in the results. The same goes for processes: an item can be present in different processes, so I cannot group by process either.
This is as far as I could get on my own:
SELECT
PV.ID_ITEM AS ID_ITEM,
PV.ITEM_LIFE AS LIFE,
COUNT(PV.ID_ITEM) AS TOTAL_ITEM_PROCESS,
MIN(P.DATE_TIME) AS OLDEST_PROCESS,
MAX(P.DATE_TIME) AS NEWEST_PROCESS,
MAX(GREATEST(PV.ITEM_VALUE_1, PV.ITEM_VALUE_2, PV.ITEM_VALUE_3)) AS MAX_ITEM_VALUE,
MIN(LEAST(PV.ITEM_VALUE_1, PV.ITEM_VALUE_2, PV.ITEM_VALUE_3)) AS MIN_ITEM_VALUE
FROM PROCESS P
JOIN PROCESS_VALUES PV ON P._ID = PV.ID_PROCESS
GROUP BY PV.ID_ITEM, PV.ITEM_LIFE;
But I have no idea how I can add the client and process IDs of the oldest and newest process to this query without having to add those same columns to the GROUP BY clause. If I add the client ID to the GROUP BY, for instance, some items will be duplicated when they have processes on different clients.
We cannot use MAX or MIN to get the process ID, because the returned ID might not match the collected timestamp: the highest ID does not always have the newest timestamp (in the sample data, for example, process 4 has an older timestamp than process 3).
For the data provided in the fiddle, this should be the output:
+---------+-----------+-----------------------------------+----------------------------------+----------------------------------+----------------------------------+----------------------------------+----------------------------------+----------------------------------+---------------------------------+---------------------------------+
| ITEM ID | ITEM LIFE | TOTAL PROCESSES PER ITEM PER LIFE | OLDEST PROCESS PER ITEM PER LIFE | NEWEST PROCESS PER ITEM PER LIFE | MAX ITEM VALUE PER ITEM PER LIFE | MIN ITEM VALUE PER ITEM PER LIFE | PROCESS ID OF THE OLDEST PROCESS | PROCESS ID OF THE NEWEST PROCESS | CLIENT ID OF THE OLDEST PROCESS | CLIENT ID OF THE NEWEST PROCESS |
+---------+-----------+-----------------------------------+----------------------------------+----------------------------------+----------------------------------+----------------------------------+----------------------------------+----------------------------------+---------------------------------+---------------------------------+
| 230 | 1 | 1 | '2018-01-01 10:00:00' | '2018-01-02 10:00:00' | 6.5 | 1.5 | 1 | 2 | 100 | 100 |
| 230 | 2 | 1 | '2018-01-01 10:00:00' | '2018-01-02 10:00:00' | 5.5 | 2.5 | 1 | 2 | 100 | 100 |
| 231 | 1 | 1 | '2018-01-01 10:00:00' | '2018-01-02 10:00:00' | 7.5 | 1.5 | 1 | 2 | 100 | 100 |
| 231 | 2 | 1 | '2018-01-01 10:00:00' | '2018-01-02 10:00:00' | 10.8 | 4.5 | 1 | 2 | 100 | 100 |
| 232 | 1 | 1 | '2018-01-01 10:00:00' | '2018-01-02 10:00:00' | 5.6 | 0.5 | 1 | 2 | 100 | 100 |
| 232 | 2 | 1 | '2018-01-01 10:00:00' | '2018-01-02 10:00:00' | 2.5 | 25.5 | 1 | 2 | 100 | 100 |
| 530 | 1 | 2 | '2018-01-05 13:00:00' | '2018-01-06 13:00:00' | 11.5 | 1.5 | 4 | 3 | 400 | 300 |
| 531 | 1 | 2 | '2018-01-05 13:00:00' | '2018-01-06 13:00:00' | 9.5 | 1.5 | 4 | 3 | 400 | 300 |
| 532 | 1 | 2 | '2018-01-05 13:00:00' | '2018-01-06 13:00:00' | 13.5 | 4.5 | 4 | 3 | 400 | 300 |
+---------+-----------+-----------------------------------+----------------------------------+----------------------------------+----------------------------------+----------------------------------+----------------------------------+----------------------------------+---------------------------------+---------------------------------+
How can I add more columns to the query without having to put them in the GROUP BY?
We use PostgreSQL.
Answer:
It looks like a common top-n-per-group problem with a slight twist. Here is one way to do it, using the ROW_NUMBER method. Another method is to use a lateral join; which method is faster depends on the data distribution.
You can read through very detailed answers on dba.se Retrieving n rows per group. (That question is for SQL Server, but Postgres has all the same features, so all answers there apply to Postgres).
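For reference, here is a minimal sketch of the lateral-join alternative mentioned above, written against the same tables and columns as the query in the question. It is only a sketch (not benchmarked): a derived table computes the per-item/per-life summaries, and two LATERAL subqueries then look up the oldest and newest process for each group.

-- Sketch only: summaries per item/life, then LATERAL lookups of the
-- oldest and newest process of each group.
SELECT
    G.ID_ITEM
    ,G.ITEM_LIFE
    ,G.TOTAL_ITEM_PROCESS
    ,G.MAX_ITEM_VALUE
    ,G.MIN_ITEM_VALUE
    ,O.DATE_TIME AS OLDEST_DATE_TIME
    ,O._ID AS OLDEST_ID_PROCESS
    ,O.ID_CLIENT AS OLDEST_ID_CLIENT
    ,N.DATE_TIME AS NEWEST_DATE_TIME
    ,N._ID AS NEWEST_ID_PROCESS
    ,N.ID_CLIENT AS NEWEST_ID_CLIENT
FROM
(
    SELECT
        PV.ID_ITEM
        ,PV.ITEM_LIFE
        ,COUNT(*) AS TOTAL_ITEM_PROCESS
        ,MAX(GREATEST(PV.ITEM_VALUE_1, PV.ITEM_VALUE_2, PV.ITEM_VALUE_3)) AS MAX_ITEM_VALUE
        ,MIN(LEAST(PV.ITEM_VALUE_1, PV.ITEM_VALUE_2, PV.ITEM_VALUE_3)) AS MIN_ITEM_VALUE
    FROM PROCESS P
    INNER JOIN PROCESS_VALUES PV ON P._ID = PV.ID_PROCESS
    GROUP BY PV.ID_ITEM, PV.ITEM_LIFE
) AS G
CROSS JOIN LATERAL
(
    -- oldest process for this item/life
    SELECT P._ID, P.ID_CLIENT, P.DATE_TIME
    FROM PROCESS P
    INNER JOIN PROCESS_VALUES PV ON P._ID = PV.ID_PROCESS
    WHERE PV.ID_ITEM = G.ID_ITEM AND PV.ITEM_LIFE = G.ITEM_LIFE
    ORDER BY P.DATE_TIME ASC
    LIMIT 1
) AS O
CROSS JOIN LATERAL
(
    -- newest process for this item/life
    SELECT P._ID, P.ID_CLIENT, P.DATE_TIME
    FROM PROCESS P
    INNER JOIN PROCESS_VALUES PV ON P._ID = PV.ID_PROCESS
    WHERE PV.ID_ITEM = G.ID_ITEM AND PV.ITEM_LIFE = G.ITEM_LIFE
    ORDER BY P.DATE_TIME DESC
    LIMIT 1
) AS N
ORDER BY G.ID_ITEM, G.ITEM_LIFE;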
SQL Fiddle
I adjusted your sample data and made all ID_CLIENT values different to be able to see them in the results.
CREATE TABLE PROCESS (
_ID BIGINT,
DATE_TIME TIMESTAMP WITH TIME ZONE,
ID_CLIENT BIGINT
);
CREATE TABLE PROCESS_VALUES (
ID_PROCESS BIGINT,
ID_ITEM BIGINT,
ITEM_LIFE INTEGER,
ITEM_VALUE_1 REAL,
ITEM_VALUE_2 REAL,
ITEM_VALUE_3 REAL
);
INSERT INTO PROCESS VALUES(1, '2018-01-01 10:00:00', 100);
INSERT INTO PROCESS VALUES(2, '2018-01-02 10:00:00', 200);
INSERT INTO PROCESS VALUES(3, '2018-01-06 13:00:00', 300);
INSERT INTO PROCESS VALUES(4, '2018-01-05 13:00:00', 400);
INSERT INTO PROCESS_VALUES VALUES(1, 230, 1, 5.5, 6.5, 1.5);
INSERT INTO PROCESS_VALUES VALUES(1, 231, 1, 1.5, 7.5, 3.5);
INSERT INTO PROCESS_VALUES VALUES(1, 232, 1, 5.6, 3.5, 0.5);
INSERT INTO PROCESS_VALUES VALUES(2, 230, 2, 5.5, 2.5, 4.5);
INSERT INTO PROCESS_VALUES VALUES(2, 231, 2, 10.8, 6.5, 4.5);
INSERT INTO PROCESS_VALUES VALUES(2, 232, 2, 25.5, 6.5, 2.5);
INSERT INTO PROCESS_VALUES VALUES(3, 530, 1, 1.5, 6.5, 8.5);
INSERT INTO PROCESS_VALUES VALUES(3, 531, 1, 3.5, 6.5, 1.5);
INSERT INTO PROCESS_VALUES VALUES(3, 532, 1, 6.5, 7.0, 4.5);
INSERT INTO PROCESS_VALUES VALUES(4, 530, 1, 1.5, 11.5, 4.5);
INSERT INTO PROCESS_VALUES VALUES(4, 531, 1, 9.5, 8.5, 1.5);
INSERT INTO PROCESS_VALUES VALUES(4, 532, 1, 5.5, 13.5, 4.5);
Query
I wrote the query using CTEs to make it readable. If you want you can inline them all into one huge query. You should try and test what runs faster.
CTE_RN is the original two tables joined together, with two sets of row numbers: one for the oldest processes and one for the newest (that is what ORDER BY DATE_TIME ASC/DESC is for), calculated per item, per life (that is what PARTITION BY is for).
CTE_OLDEST keeps only the oldest processes.
CTE_NEWEST keeps only the newest processes.
CTE_GROUPS is your query from the question that calculates all the summaries.
The final SELECT joins the summaries with the info about the oldest and newest processes.
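(Side note, not what the query below does: in Postgres, CTE_OLDEST and CTE_NEWEST could also be written with DISTINCT ON, which keeps the first row per group according to the ORDER BY. A minimal sketch for the oldest process per item/life follows; the newest is the same with DATE_TIME DESC.)

-- Sketch only: DISTINCT ON keeps the first row per (ID_ITEM, ITEM_LIFE)
-- according to the ORDER BY, i.e. the oldest process here.
SELECT DISTINCT ON (PV.ID_ITEM, PV.ITEM_LIFE)
    PV.ID_ITEM
    ,PV.ITEM_LIFE
    ,P._ID AS ID_PROCESS
    ,P.ID_CLIENT
    ,P.DATE_TIME
FROM PROCESS P
INNER JOIN PROCESS_VALUES PV ON P._ID = PV.ID_PROCESS
ORDER BY PV.ID_ITEM, PV.ITEM_LIFE, P.DATE_TIME ASC;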
WITH
CTE_RN
AS
(
SELECT
PROCESS.DATE_TIME
,PROCESS._ID AS ID_PROCESS
,PROCESS.ID_CLIENT
,PROCESS_VALUES.ID_ITEM
,PROCESS_VALUES.ITEM_LIFE
,PROCESS_VALUES.ITEM_VALUE_1
,PROCESS_VALUES.ITEM_VALUE_2
,PROCESS_VALUES.ITEM_VALUE_3
,ROW_NUMBER() OVER (PARTITION BY PROCESS_VALUES.ID_ITEM, PROCESS_VALUES.ITEM_LIFE
ORDER BY PROCESS.DATE_TIME ASC) AS rn1
,ROW_NUMBER() OVER (PARTITION BY PROCESS_VALUES.ID_ITEM, PROCESS_VALUES.ITEM_LIFE
ORDER BY PROCESS.DATE_TIME DESC) AS rn2
FROM
PROCESS
INNER JOIN PROCESS_VALUES ON PROCESS._ID = PROCESS_VALUES.ID_PROCESS
)
,CTE_OLDEST
AS
(
SELECT
ID_ITEM
,ITEM_LIFE
,ID_PROCESS
,ID_CLIENT
,DATE_TIME
FROM CTE_RN
WHERE rn1 = 1
)
,CTE_NEWEST
AS
(
SELECT
ID_ITEM
,ITEM_LIFE
,ID_PROCESS
,ID_CLIENT
,DATE_TIME
FROM CTE_RN
WHERE rn2 = 1
)
,CTE_GROUPS
AS
(
SELECT
ID_ITEM
,ITEM_LIFE
,COUNT(ID_ITEM) AS TOTAL_ITEM_PROCESS
,MIN(DATE_TIME) AS OLDEST_PROCESS
,MAX(DATE_TIME) AS NEWEST_PROCESS
,MAX(GREATEST(ITEM_VALUE_1, ITEM_VALUE_2, ITEM_VALUE_3)) AS MAX_ITEM_VALUE
,MIN(LEAST(ITEM_VALUE_1, ITEM_VALUE_2, ITEM_VALUE_3)) AS MIN_ITEM_VALUE
FROM CTE_RN
GROUP BY
ID_ITEM, ITEM_LIFE
)
SELECT
CTE_GROUPS.ID_ITEM
,CTE_GROUPS.ITEM_LIFE
,CTE_GROUPS.TOTAL_ITEM_PROCESS
,CTE_GROUPS.MAX_ITEM_VALUE
,CTE_GROUPS.MIN_ITEM_VALUE
,CTE_OLDEST.DATE_TIME AS OLDEST_DATE_TIME
,CTE_OLDEST.ID_PROCESS AS OLDEST_ID_PROCESS
,CTE_OLDEST.ID_CLIENT AS OLDEST_ID_CLIENT
,CTE_NEWEST.DATE_TIME AS NEWEST_DATE_TIME
,CTE_NEWEST.ID_PROCESS AS NEWEST_ID_PROCESS
,CTE_NEWEST.ID_CLIENT AS NEWEST_ID_CLIENT
FROM
CTE_GROUPS
INNER JOIN CTE_OLDEST
ON CTE_OLDEST.ID_ITEM = CTE_GROUPS.ID_ITEM
AND CTE_OLDEST.ITEM_LIFE = CTE_GROUPS.ITEM_LIFE
INNER JOIN CTE_NEWEST
ON CTE_NEWEST.ID_ITEM = CTE_GROUPS.ID_ITEM
AND CTE_NEWEST.ITEM_LIFE = CTE_GROUPS.ITEM_LIFE
ORDER BY
ID_ITEM, ITEM_LIFE
Results:
| id_item | item_life | total_item_process | max_item_value | min_item_value | oldest_date_time | oldest_id_process | oldest_id_client | newest_date_time | newest_id_process | newest_id_client |
|---------|-----------|--------------------|----------------|----------------|----------------------|-------------------|------------------|----------------------|-------------------|------------------|
| 230 | 1 | 1 | 6.5 | 1.5 | 2018-01-01T10:00:00Z | 1 | 100 | 2018-01-01T10:00:00Z | 1 | 100 |
| 230 | 2 | 1 | 5.5 | 2.5 | 2018-01-02T10:00:00Z | 2 | 200 | 2018-01-02T10:00:00Z | 2 | 200 |
| 231 | 1 | 1 | 7.5 | 1.5 | 2018-01-01T10:00:00Z | 1 | 100 | 2018-01-01T10:00:00Z | 1 | 100 |
| 231 | 2 | 1 | 10.8 | 4.5 | 2018-01-02T10:00:00Z | 2 | 200 | 2018-01-02T10:00:00Z | 2 | 200 |
| 232 | 1 | 1 | 5.6 | 0.5 | 2018-01-01T10:00:00Z | 1 | 100 | 2018-01-01T10:00:00Z | 1 | 100 |
| 232 | 2 | 1 | 25.5 | 2.5 | 2018-01-02T10:00:00Z | 2 | 200 | 2018-01-02T10:00:00Z | 2 | 200 |
| 530 | 1 | 2 | 11.5 | 1.5 | 2018-01-05T13:00:00Z | 4 | 400 | 2018-01-06T13:00:00Z | 3 | 300 |
| 531 | 1 | 2 | 9.5 | 1.5 | 2018-01-05T13:00:00Z | 4 | 400 | 2018-01-06T13:00:00Z | 3 | 300 |
| 532 | 1 | 2 | 13.5 | 4.5 | 2018-01-05T13:00:00Z | 4 | 400 | 2018-01-06T13:00:00Z | 3 | 300 |
The result doesn't match your expected output exactly, but I believe there are typos in the question. You said you want the oldest and newest process per item per life, and items 230, 231 and 232 have only one process per life, so their oldest and newest processes are the same, as you can see in the results of my query.