I have read similar questions, and it seems possible to make this work with window functions. However, since ClickHouse does not seem to support them, I am looking for an alternative solution.
Given time intervals like (1, 5), (2, 3), (3, 8), (10, 15), I want to "merge" overlapping intervals into single ones. In my example, the result would be:
(1, 8) and (10, 15).
Any pointers are appreciated! Thanks!
This task would be easy to solve with arrayReduce if that function accepted an arbitrary lambda. Since it does not, let's try to solve the problem with the means available.
SELECT
    intervals,
    arraySort(x -> x, intervals) sortedIntervals,
    /* try to merge each interval with the preceding ones */
    arrayMap((x, index) -> index != 1
        ? (arrayReduce(
                'min',
                arrayMap(
                    i -> sortedIntervals[i + 1].1,
                    /* get indexes of intervals that can be merged with the current one (index is zero-based) */
                    arrayFilter(
                        i -> x.1 <= sortedIntervals[i + 1].2 AND x.2 >= sortedIntervals[i + 1].1,
                        range(index)))),
           arrayReduce(
                'max',
                arrayMap(
                    i -> sortedIntervals[i + 1].2,
                    /* get indexes of intervals that can be merged with the current one (index is zero-based) */
                    arrayFilter(
                        i -> x.1 <= sortedIntervals[i + 1].2 AND x.2 >= sortedIntervals[i + 1].1,
                        range(index)))))
        : x,
        sortedIntervals,
        arrayEnumerate(sortedIntervals)) rawResult,
    /* filter out intervals nested in other ones */
    arrayFilter(
        (x, index) -> index == length(rawResult) OR x.1 != rawResult[index + 1].1,
        rawResult,
        arrayEnumerate(rawResult)) result
FROM
(
    SELECT [(1, 5), (2, 3), (3, 8), (10, 15)] intervals
    UNION ALL
    SELECT [(2, 4), (1, 3), (3, 6), (12, 14), (7, 7), (13, 16), (9, 9), (8, 9), (10, 15)]
    UNION ALL
    SELECT [(20, 22), (18, 18), (16, 21), (1, 8), (2, 9), (3, 5), (10, 12), (11, 13), (14, 15)]
    UNION ALL
    SELECT []
    UNION ALL
    SELECT [(1, 11)]
)
FORMAT Vertical;
/*
Row 1:
──────
intervals: [(2,4),(1,3),(3,6),(12,14),(7,7),(13,16),(9,9),(8,9),(10,15)]
sortedIntervals: [(1,3),(2,4),(3,6),(7,7),(8,9),(9,9),(10,15),(12,14),(13,16)]
rawResult: [(1,3),(1,4),(1,6),(7,7),(8,9),(8,9),(10,15),(10,15),(10,16)]
result: [(1,6),(7,7),(8,9),(10,16)]
Row 2:
──────
intervals: [(1,5),(2,3),(3,8),(10,15)]
sortedIntervals: [(1,5),(2,3),(3,8),(10,15)]
rawResult: [(1,5),(1,5),(1,8),(10,15)]
result: [(1,8),(10,15)]
Row 3:
──────
intervals: [(20,22),(18,18),(16,21),(1,8),(2,9),(3,5),(10,12),(11,13),(14,15)]
sortedIntervals: [(1,8),(2,9),(3,5),(10,12),(11,13),(14,15),(16,21),(18,18),(20,22)]
rawResult: [(1,8),(1,9),(1,9),(10,12),(10,13),(14,15),(16,21),(16,21),(16,22)]
result: [(1,9),(10,13),(14,15),(16,22)]
Row 4:
──────
intervals: []
sortedIntervals: []
rawResult: []
result: []
Row 5:
──────
intervals: [(1,11)]
sortedIntervals: [(1,11)]
rawResult: [(1,11)]
result: [(1,11)]
*/
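One case worth checking separately is a chain where overlap is only transitive. By my reading, the query above compares each interval only with the earlier ones it overlaps directly, so an input such as (1, 3), (2, 5), (4, 8) appears to come out as (1, 5), (2, 8) rather than (1, 8). Below is a minimal sketch of a variant based on a running maximum of interval ends. It is not the method of the answer above, it has not been run against a live server, and the names runMax, heads and tails are made up for illustration. It uses only the array functions already used above, and it assumes that an out-of-range non-constant array index (here runMax[i - 1] for i = 1) returns a default value, as ClickHouse documents.

```sql
SELECT
    intervals,
    arraySort(x -> x, intervals) AS sortedIntervals,
    /* runMax[i]: largest end among the first i sorted intervals (hypothetical helper) */
    arrayMap(
        i -> arrayReduce('max', arrayMap(j -> sortedIntervals[j + 1].2, range(i))),
        arrayEnumerate(sortedIntervals)) AS runMax,
    /* positions where a merged interval begins: the start lies beyond every earlier end;
       for i = 1 the left side of OR decides, runMax[0] is assumed to yield a default value */
    arrayFilter(
        i -> i = 1 OR sortedIntervals[i].1 > runMax[i - 1],
        arrayEnumerate(sortedIntervals)) AS heads,
    /* positions where a merged interval ends: the next start lies beyond the running maximum */
    arrayFilter(
        i -> i = length(sortedIntervals) OR sortedIntervals[i + 1].1 > runMax[i],
        arrayEnumerate(sortedIntervals)) AS tails,
    /* heads and tails pair up one-to-one, in order */
    arrayMap((h, t) -> (sortedIntervals[h].1, runMax[t]), heads, tails) AS result
FROM
(
    SELECT [(1, 3), (2, 5), (4, 8)] AS intervals
    UNION ALL
    SELECT [(1, 5), (2, 3), (3, 8), (10, 15)]
)
FORMAT Vertical;
```

Like the original query, this treats touching intervals such as (1, 3) and (3, 6) as overlapping, and it is still quadratic in the array length because the running maximum is recomputed for every position.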