I have a table with gps data:
lat, lon, timestamp
My queries select by a timestamp window:
SELECT * FROM gps WHERE timesamp >= '2016-12-12T02:00:00Z' AND timestamp <= '2016-12-12T03:00:00Z'
The resultset becomes way to big for larger timespans and I want to limit the size of the resultset. However, I still want the data to cover the entire timespan, but with reduced resolution. So instead of
LIMIT 1000
I want to select every n'th element based on my specified max resultset size and the actual size (count) of the data.
I want something like this
limit = 1000
totalCount = SELECT COUNT(*) FROM gps WHERE timestamp >= ...
resolution = totalCount / limit
SELECT every resolution'th FROM SELECT * FROM gps WHERE timestamp >= ...
Work with ROW_NUMBER()
, which is a "window function":
SELECT s.* FROM (
SELECT t.*,
ROW_NUMBER() OVER(ORDER BY t.timestamp) as rnk,
COUNT(*) OVER() as total_cnt
FROM gps t
WHERE t.timestamp between '2016-12-12T02:00:00Z' AND '2016-12-12T03:00:00Z') s
WHERE MOD(s.rnk,(total_cnt/1000)) = 0
An alternative is to randomize the data and then use limit
:
SELECT *
FROM gp
WHERE timesamp >= '2016-12-12T02:00:00Z' AND timestamp <= '2016-12-12T03:00:00Z'
ORDER BY random()
LIMIT x;
This doesn't guarantee an exact distribution across all timestamps. But, it does make it possible to get exactly 1000 values.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With