This question is an extended and simplified version of this question.
I have been trying to solve in SQL the following iterative equation:
U^{F,D}_{t,p} = (\sum_{D} U^{F,D}_{t-1,p} + C_{t-1,p} )*R^{F,D}_{t-1,p}
that results in:
The closest analogy I can think of is that U^{F,D}_{t,p}
is a number of cars of brand F
, of certain color (D
), that a car dealer (p
) has available at time t
. So the above equation basically says: take the cars' units from the day before t-1
(i.e., U^{F,D}_{t-1,p}
), sum over colors (\sum_{D}
), then add to the sum a C
value from the day before (C_{t-1,p}
, whatever that is), and multiply by some other number R
from the day before (R^{F,D}_{t-1,p}
, whatever that is too).
I have managed to solve a simplified form of the above equation, namely:
i.e., without the sum over cars' colors (D
). The sample data, and the SQL query are in the fiddle that I link, but I paste it here for reference as well:
FULL DATA:
CREATE TABLE DYNAMICS ( T DATE, T_M1 DATE, P INTEGER, F VARCHAR(255), DELTA_F VARCHAR(255), R_T_M1 NUMBER, C_T_M1 NUMBER, U_T_M1 NUMBER, R_T NUMBER, C_T NUMBER, U_T NUMBER );
-- DAY 1, P_1
INSERT INTO DYNAMICS VALUES ( TO_DATE('01.01.2015','DD.MM.YYYY HH24:MI:SS'), TO_DATE('31.12.2014','DD.MM.YYYY HH24:MI:SS'), 1,'BMW','RED', 0.5, 0.6, NULL, 0.7,0.8,100.0 );
INSERT INTO DYNAMICS VALUES ( TO_DATE('01.01.2015','DD.MM.YYYY HH24:MI:SS'), TO_DATE('31.12.2014','DD.MM.YYYY HH24:MI:SS'), 1,'MERCEDES','RED', 0.5, 0.6, NULL, 0.7,0.8,50.0 );
-- DAY 1, P_2
INSERT INTO DYNAMICS VALUES ( TO_DATE('01.01.2015','DD.MM.YYYY HH24:MI:SS'), TO_DATE('31.12.2014','DD.MM.YYYY HH24:MI:SS'), 2,'BMW','RED', 0.5, 0.6, NULL, 0.7,0.8,10.0 );
INSERT INTO DYNAMICS VALUES ( TO_DATE('01.01.2015','DD.MM.YYYY HH24:MI:SS'), TO_DATE('31.12.2014','DD.MM.YYYY HH24:MI:SS'), 2,'MERCEDES','RED', 0.5, 0.6, NULL, 0.7,0.8,5.0 );
-- DAY 2, P_1
INSERT INTO DYNAMICS VALUES ( TO_DATE('02.01.2015','DD.MM.YYYY HH24:MI:SS'), TO_DATE('01.01.2015','DD.MM.YYYY HH24:MI:SS'), 1,'BMW','RED', 0.7, 0.8, 100, 0.9,0.9, NULL );
INSERT INTO DYNAMICS VALUES ( TO_DATE('02.01.2015','DD.MM.YYYY HH24:MI:SS'), TO_DATE('01.01.2015','DD.MM.YYYY HH24:MI:SS'), 1,'MERCEDES','RED', 0.7, 0.8, 50, 0.6,0.5, NULL );
-- DAY 2, P_2
INSERT INTO DYNAMICS VALUES ( TO_DATE('02.01.2015','DD.MM.YYYY HH24:MI:SS'), TO_DATE('01.01.2015','DD.MM.YYYY HH24:MI:SS'), 2,'BMW','RED', 0.7, 0.8, 10, 0.7,0.8, NULL );
INSERT INTO DYNAMICS VALUES ( TO_DATE('02.01.2015','DD.MM.YYYY HH24:MI:SS'), TO_DATE('01.01.2015','DD.MM.YYYY HH24:MI:SS'), 2,'MERCEDES','RED', 0.7, 0.8, 5, 0.3,0.3, NULL );
-- DAY 3, P_1
INSERT INTO DYNAMICS VALUES ( TO_DATE('03.01.2015','DD.MM.YYYY HH24:MI:SS'), TO_DATE('02.01.2015','DD.MM.YYYY HH24:MI:SS'), 1,'BMW','RED', 0.9, 0.9, NULL, 0.2,0.3, NULL );
INSERT INTO DYNAMICS VALUES ( TO_DATE('03.01.2015','DD.MM.YYYY HH24:MI:SS'), TO_DATE('02.01.2015','DD.MM.YYYY HH24:MI:SS'), 1,'MERCEDES','RED', 0.6, 0.5, NULL, 1.7,1.8, NULL );
-- DAY 3, P_2
INSERT INTO DYNAMICS VALUES ( TO_DATE('03.01.2015','DD.MM.YYYY HH24:MI:SS'), TO_DATE('02.01.2015','DD.MM.YYYY HH24:MI:SS'), 2,'BMW','RED', 0.7, 0.8, NULL, 0.2,0.3, NULL );
INSERT INTO DYNAMICS VALUES ( TO_DATE('03.01.2015','DD.MM.YYYY HH24:MI:SS'), TO_DATE('02.01.2015','DD.MM.YYYY HH24:MI:SS'), 2,'MERCEDES','RED', 0.3, 0.3, NULL, 0.8,0.9, NULL );
SAMPLE DATA:
The following demonstrates example data for car dealer p=1
, car model F=BMW
of color D=RED
(D
from the mathematical equation is called DELTA
in SQL). The initial condition (t=0
) is here 2015-01-01. For all the days t
, all the parameters at t
(R_T, C_T
), and t-1
(R_T_M1, C_T_M1
) are given. Knowing them, the task is to compute cars' units for all the days t > t=0
.
| T | T_M1 | P | F | DELTA_F | R_T_M1 | C_T_M1 | U_T_M1 | R_T | C_T | U_T |
|---------------------------|----------------------------|---|-----|---------|--------|--------|--------|-----|-----|--------|
| January, 01 2015 00:00:00 | December, 31 2014 00:00:00 | 1 | BMW | RED | 0.5 | 0.6 | (null) | 0.7 | 0.8 | 100 |
| January, 02 2015 00:00:00 | January, 01 2015 00:00:00 | 1 | BMW | RED | 0.7 | 0.8 | 100 | 0.9 | 0.9 | (null) |
| January, 03 2015 00:00:00 | January, 02 2015 00:00:00 | 1 | BMW | RED | 0.9 | 0.9 | (null) | 0.2 | 0.3 | (null) |
QUERY:
In order to resolve the simplified problem, I have come up with the query in the linked fiddle that I paste here as well for reference:
--
-- SQL
-- T -> t
-- T_M1 -> t-1
--
WITH RECU( T, T_M1, P, F, DELTA_F,
R_T_M1, C_T_M1, U_T_M1,
R_T, C_T, U_T ) AS (
-- Anchor member.
SELECT T, T_M1, P, F, DELTA_F,
R_T_M1, C_T_M1,
U_T_M1,
R_T, C_T,
U_T
FROM DYNAMICS
-- Initial condition: U_{t-1} does not exist, and U_{t=0} is given
WHERE ( U_T_M1 IS NULL AND U_T IS NOT NULL )
UNION ALL
-- Recursive member.
SELECT NEW.T, NEW.T_M1, NEW.P, NEW.F, NEW.DELTA_F,
NEW.R_T_M1, NEW.C_T_M1,
RECU.U_T AS U_T_M1,
NEW.R_T, NEW.C_T,
-- Here the magic happens, i.e., (U_{t-1} + C_{t-1})*R_{t-1} = U_{t}
(RECU.U_T+NEW.C_T_M1)*NEW.R_T_M1 AS U_T
FROM DYNAMICS NEW
INNER JOIN RECU
ON
-- Translates: yesterday (t-1) of the new record equals today (t) of the parent record
NEW.T_M1 = RECU.T AND
NEW.P = RECU.P AND
NEW.F = RECU.F AND
NEW.DELTA_F = RECU.DELTA_F
)
SELECT * FROM RECU ORDER BY P, F, T;
This query, for the example data pasted above, results in:
| T | T_M1 | P | F | DELTA_F | R_T_M1 | C_T_M1 | U_T_M1 | R_T | C_T | U_T |
|---------------------------|----------------------------|---|-----|---------|--------|--------|--------|-----|-----|--------|
| January, 01 2015 00:00:00 | December, 31 2014 00:00:00 | 1 | BMW | RED | 0.5 | 0.6 | (null) | 0.7 | 0.8 | 100 |
| January, 02 2015 00:00:00 | January, 01 2015 00:00:00 | 1 | BMW | RED | 0.7 | 0.8 | 100 | 0.9 | 0.9 | 70.56 |
| January, 03 2015 00:00:00 | January, 02 2015 00:00:00 | 1 | BMW | RED | 0.9 | 0.9 | 70.56 | 0.2 | 0.3 | 64.314 |
Which works well, i.e., for: 2015-01-02, U_t = (100+0.8)*0.7 = 70.56
, 2015-01-03, U_t = (70.56+0.9)*0.9 = 64.314
.
The query is written in such a way that it works with different car dealers, and different car brands, which can be checked running the query in the linked fiddle
The query above cannot handle correctly the sum over cars' colors from the original equation:
This was irrelevant in the simplified data, since all cars (BMW and MERCEDES) occur there only in RED, and so the sum over colors effectively vanishes.
Such full logic should be probably implemented via a GROUP BY/SUM
expression built in into the original query above. Unfortunately, I do not know how to do it.
So, imagine you have data in the shape like in the simplified problem section, but now every car brand exists in two colors, e.g., like in this linked fiddle:
| T | T_M1 | P | F | DELTA_F | R_T_M1 | C_T_M1 | U_T_M1 | R_T | C_T | U_T |
|---------------------------|----------------------------|---|----------|---------|--------|--------|--------|-----|-----|--------|
| January, 01 2015 00:00:00 | December, 31 2014 00:00:00 | 2 | MERCEDES | BLACK | 0.2 | 0.6 | (null) | 0.5 | 0.8 | 5.5 |
| January, 01 2015 00:00:00 | December, 31 2014 00:00:00 | 2 | MERCEDES | RED | 0.5 | 0.6 | (null) | 0.7 | 0.8 | 5 |
| January, 02 2015 00:00:00 | January, 01 2015 00:00:00 | 2 | MERCEDES | BLACK | 0.5 | 0.8 | 5.5 | 1.3 | 0.5 | (null) |
| January, 02 2015 00:00:00 | January, 01 2015 00:00:00 | 2 | MERCEDES | RED | 0.7 | 0.8 | 5 | 4.3 | 0.5 | (null) |
| January, 03 2015 00:00:00 | January, 02 2015 00:00:00 | 2 | MERCEDES | BLACK | 1.3 | 0.5 | (null) | 0.3 | 0.9 | (null) |
| January, 03 2015 00:00:00 | January, 02 2015 00:00:00 | 2 | MERCEDES | RED | 4.3 | 0.5 | (null) | 0.4 | 0.9 | (null) |
Given such data, you would expect for the dealer p=2
F=MERCEDES
cars dynamics to look as follows:
U^{MERCEDES,BLACK}_{T=2015-01-02,P=2} = ( (5.5 + 5) + 0.8 )*0.5 = 11.3*0.5 = 5.65
U^{MERCEDES,RED}_{T=2015-01-02,P=2} = ( (5.5 + 5) + 0.8 )*0.7 = 11.3*0.7 = 7.91
U^{MERCEDES,BLACK}_{T=2015-01-03,P=2} = ( (5.65 + 7.91) + 0.5 )*1.3 = 14.06*1.3 = 18.278
U^{MERCEDES,RED}_{T=2015-01-03,P=2} = ( (5.65 + 7.91) + 0.5 )*4.3 = 14.06*4.3 = 60.458
Question is how the simplified query above should be adjusted to solve this problem.
A recursive query is one that is defined by a Union All with an initialization fullselect that seeds the recursion. The iterative fullselect contains a direct reference to itself in the FROM clause. There are additional restrictions as to what can be specified in the definition of a recursive query.
An aggregate function in SQL performs a calculation on multiple values and returns a single value. SQL provides many aggregate functions that include avg, count, sum, min, max, etc. An aggregate function ignores NULL values when it performs the calculation, except for the count function.
SQL aggregation is the task of collecting a set of values to return a single value. It is done with the help of aggregate functions, such as SUM, COUNT, and AVG. For example, in a database of products, you might want to calculate the average price of the whole inventory.
I do not think this is the best answer but I think it gives you the result that you are looking for.
WITH RECU( T, T_M1, P, F, DELTA_F,
R_T_M1, C_T_M1, U_T_M1,
R_T, C_T, U_T ) AS (
-- Anchor member.
SELECT T, T_M1, P, F, DELTA_F,
R_T_M1, C_T_M1,
U_T_M1,
R_T, C_T,
-- Start SUM of u_t
(select sum(u_t) from DYNAMICS d2
where d2.T=d1.T and d2.T_M1=d1.T_M1 and d2.P=d1.P and d2.F=d1.F
group by T, T_M1, P, F) as u_t
-- End SUM of u_t
FROM DYNAMICS d1
-- Initial condition: U_{t-1} does not exist, and U_{t=0} is given
WHERE ( U_T_M1 IS NULL AND U_T IS NOT NULL )
UNION ALL
-- Recursive member.
SELECT NEW.T, NEW.T_M1, NEW.P, NEW.F, NEW.DELTA_F,
NEW.R_T_M1, NEW.C_T_M1,
RECU.U_T AS U_T_M1,
NEW.R_T, NEW.C_T
,
-- Here the magic happens, i.e., (U_{t-1} + C_{t-1})*R_{t-1} = U_{t}
(
RECU.U_T
+NEW.C_T_M1)*NEW.R_T_M1 AS U_T
FROM DYNAMICS NEW
INNER JOIN RECU
ON
-- Translates: yesterday (t-1) of the new record equals today (t) of the parent record
NEW.T_M1 = RECU.T AND
NEW.P = RECU.P AND
NEW.F = RECU.F AND
NEW.DELTA_F = RECU.DELTA_F
)
SELECT * FROM RECU ORDER BY P, F, T;
What i've added is between Start SUM of u_t
and End SUM of u_t
comments and here is the fiddle.
The solution turned out to be easier than I had thought (although I have spent one day trying all sorts of things, and now all seems to be trivial).
The query working (tested) on the original fiddle data reads:
WITH RECU( T, T_M1, P, F, DELTA_F,
R_T_M1, C_T_M1, U_T_M1,
R_T, C_T, U_T ) AS (
-- Anchor member.
SELECT T, T_M1, P, F, DELTA_F,
R_T_M1, C_T_M1,
U_T_M1,
R_T, C_T,
U_T
FROM DYNAMICS
-- Initial condition: U_{t-1} does not exist, and U_{t=0} is given
WHERE ( U_T_M1 IS NULL AND U_T IS NOT NULL )
UNION ALL
-- Recursive member.
SELECT NEW.T, NEW.T_M1, NEW.P, NEW.F, NEW.DELTA_F,
NEW.R_T_M1, NEW.C_T_M1,
RECU.U_T AS U_T_M1,
NEW.R_T, NEW.C_T,
-- Here the magic happens, i.e., (U_{t-1} + C_{t-1})*R_{t-1} = U_{t}
( (( SUM(RECU.U_T) OVER (PARTITION BY NEW.T, NEW.T_M1, NEW.P, NEW.F) ) + NEW.C_T_M1)*NEW.R_T_M1 ) AS U_T
FROM DYNAMICS NEW
INNER JOIN RECU
ON
-- Translates: yesterday (t-1) of the new record equals today (t) of the parent record
NEW.T_M1 = RECU.T AND
NEW.P = RECU.P AND
NEW.F = RECU.F AND
NEW.DELTA_F = RECU.DELTA_F
)
SELECT * FROM RECU
ORDER BY P, F, T, DELTA_F;
Which is a minimal change to the original query (only one line of the original query affected), and uses ORACLE analytic function.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With