I am trying to write a SQL query so I can build a time series chart in Google Data Studio using a BigQuery connection. You can see my SQL below.
WITH
CTE_1 AS
(SELECT ID, Date, Min_Predict, Max_Predict, Interval
,ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Date) AS row_num
FROM
table),
CTE_2 AS
(SELECT Date, Min_Predict, Max_Predict,
SUM(Min_Predict) OVER (ORDER BY Date) AS Min,
SUM(Max_Predict) OVER (ORDER BY Date) AS Max
FROM CTE_1
WHERE
row_num = 1 AND Interval = 'A')
SELECT Date, Min, Max
FROM CTE_2
GROUP BY Date, Min, Max
ORDER BY Date
I get this table as a result.
Row ProgressDate EstMin EstMax
1 2017-07-21T00:00:00Z 0.125 0.25
2 2017-07-24T00:00:00Z 5.125 5.375
3 2017-07-25T00:00:00Z 8.75 10.25
4 2017-07-26T00:00:00Z 10.0 12.0
5 2017-07-27T00:00:00Z 10.5 12.75
6 2017-08-01T00:00:00Z 15.25 19.125
7 2017-08-02T00:00:00Z 15.5 19.375
8 2017-08-05T00:00:00Z 16.25 20.625
As you can see, some dates are missing, e.g. between 21.07 and 24.07. How can I fill those missing dates with the data from the previous day? In Data Studio those days show up as missing data. I could set them to 0, but I don't want that.
The query below is for BigQuery Standard SQL and builds on your current result (put your existing query inside the your_current_result CTE):
#standardSQL
WITH your_current_result AS (
......
), days AS (
SELECT day
FROM (
SELECT
MIN(DATE(TIMESTAMP(ProgressDate))) min_dt,
MAX(DATE(TIMESTAMP(ProgressDate))) max_dt
FROM your_current_result
), UNNEST(GENERATE_DATE_ARRAY(min_dt, max_dt)) day
)
SELECT day,
LAST_VALUE(EstMin IGNORE NULLS) OVER(ORDER BY day) EstMin,
LAST_VALUE(EstMax IGNORE NULLS) OVER(ORDER BY day) EstMax
FROM days
LEFT JOIN your_current_result
ON day = DATE(TIMESTAMP(ProgressDate))
-- ORDER BY day
You can test and play with the above using the example output from your question:
#standardSQL
WITH your_current_result AS (
SELECT '2017-07-21T00:00:00Z' ProgressDate, 0.125 EstMin, 0.25 EstMax UNION ALL
SELECT '2017-07-24T00:00:00Z', 5.125, 5.375 UNION ALL
SELECT '2017-07-25T00:00:00Z', 8.75, 10.25 UNION ALL
SELECT '2017-07-26T00:00:00Z', 10.0, 12.0 UNION ALL
SELECT '2017-07-27T00:00:00Z', 10.5, 12.75 UNION ALL
SELECT '2017-08-01T00:00:00Z', 15.25, 19.125 UNION ALL
SELECT '2017-08-02T00:00:00Z', 15.5, 19.375 UNION ALL
SELECT '2017-08-05T00:00:00Z', 16.25, 20.625
), days AS (
SELECT day
FROM (
SELECT
MIN(DATE(TIMESTAMP(ProgressDate))) min_dt,
MAX(DATE(TIMESTAMP(ProgressDate))) max_dt
FROM your_current_result
), UNNEST(GENERATE_DATE_ARRAY(min_dt, max_dt)) day
)
SELECT day,
LAST_VALUE(EstMin IGNORE NULLS) OVER(ORDER BY day) EstMin,
LAST_VALUE(EstMax IGNORE NULLS) OVER(ORDER BY day) EstMax
FROM days
LEFT JOIN your_current_result
ON day = DATE(TIMESTAMP(ProgressDate))
ORDER BY day
with the result:
Row day EstMin EstMax
1 2017-07-21 0.125 0.25
2 2017-07-22 0.125 0.25
3 2017-07-23 0.125 0.25
4 2017-07-24 5.125 5.375
5 2017-07-25 8.75 10.25
6 2017-07-26 10.0 12.0
7 2017-07-27 10.5 12.75
8 2017-07-28 10.5 12.75
9 2017-07-29 10.5 12.75
10 2017-07-30 10.5 12.75
11 2017-07-31 10.5 12.75
12 2017-08-01 15.25 19.125
13 2017-08-02 15.5 19.375
14 2017-08-03 15.5 19.375
15 2017-08-04 15.5 19.375
16 2017-08-05 16.25 20.625
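To see the forward-fill step in isolation, here is a minimal sketch. The `sample` CTE and its four rows are made up for illustration: they mimic what the LEFT JOIN above produces for 21.07 to 24.07, where the generated days with no matching row carry NULL. The window frame is written out explicitly here. It is the same as the default frame that `OVER(ORDER BY day)` uses, as long as each day appears only once.

```sql
#standardSQL
WITH sample AS (
  -- Hypothetical rows: two real values with two NULL gap days between them
  SELECT DATE '2017-07-21' AS day, 0.125 AS EstMin UNION ALL
  SELECT DATE '2017-07-22', CAST(NULL AS FLOAT64) UNION ALL
  SELECT DATE '2017-07-23', CAST(NULL AS FLOAT64) UNION ALL
  SELECT DATE '2017-07-24', 5.125
)
SELECT
  day,
  -- Take the most recent non-NULL value from the first row up to the current row
  LAST_VALUE(EstMin IGNORE NULLS) OVER (
    ORDER BY day
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS EstMin
FROM sample
ORDER BY day
```

The two gap days (22.07 and 23.07) pick up 0.125 from 21.07, and 24.07 keeps its own value of 5.125. Note that a day before the first non-NULL value would stay NULL, since there is nothing earlier to carry forward.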