Given a function zipdistance(zipfrom,zipto) which calculates the distance (in miles) between two zip codes and the following tables:
create table zips_required(
zip varchar2(5)
);
create table zips_available(
zip varchar2(5),
locations number(100)
);
How can I construct a query that will return to me each zip code from the zips_required table and the minimum distance that would produce a sum(locations) >= n.
Up till now we've just run an exhaustive loop querying for each radius until we've met the criteria.
--Do this over and over incrementing the radius until the minimum requirement is met
select count(locations)
from zips_required zr
left join zips_available za on (zipdistance(zr.zip,za.zip)< 2) -- Where 2 is the radius
This can take a while on a large list. It feels like this could be done with an oracle analytic query along the lines of:
min() over (
partition by zips_required.zip
order by zipdistance( zips_required.zip, zips_available.zip)
--range stuff here?
)
The only analytic queries I have done have been "row_number over (partition by order by)" based, and I'm treading into unknown areas with this. Any guidance on this is greatly appreciated.
Analytical functions are used to do 'analyze' data over multiple rows and return the result in the current row. E.g Analytical functions can be used to find out running totals, ranking the rows, do some aggregation on the previous or forthcoming row etc.
1) Analytical function works on each rows of table and return single data whereas Aggregate function works on each rows and returns multiple set of data.
With analytical queries, you can combine data from multiple queries from the same or differing data sources into one result set. In some situations, you might need to draw data from several different sets of data, some of which might be stored in different data sources.
This is what I came up with :
SELECT zr, min_distance
FROM (SELECT zr, min_distance, cnt,
row_number() over(PARTITION BY zr ORDER BY min_distance) rnk
FROM (SELECT zr.zip zr, zipdistance(zr.zip, za.zip) min_distance,
COUNT(za.locations) over(
PARTITION BY zr.zip
ORDER BY zipdistance(zr.zip, za.zip)
) cnt
FROM zips_required zr
CROSS JOIN zips_available za)
WHERE cnt >= :N)
WHERE rnk = 1
zip_required
calculate the distance to the zip_available
and sort them by distancezip_required
the count
with range
allows you to know how many zip_availables
are in the radius of that distance.I used to create sample data:
INSERT INTO zips_required
SELECT to_char(10000 + 100 * ROWNUM) FROM dual CONNECT BY LEVEL <= 5;
INSERT INTO zips_available
(SELECT to_number(zip) + 10 * r, 100 - 10 * r FROM zips_required, (SELECT ROWNUM r FROM dual CONNECT BY LEVEL <= 9));
CREATE OR REPLACE FUNCTION zipdistance(zipfrom VARCHAR2,zipto VARCHAR2) RETURN NUMBER IS
BEGIN
RETURN abs(to_number(zipfrom) - to_number(zipto));
END zipdistance;
/
Note: you used COUNT(locations) and SUM(locations) in your question, I assumed it was COUNT(locations)
SELECT *
FROM (
SELECT zip, zd, ROW_NUMBER() OVER (PARTITION BY zip ORDER BY rn DESC) AS rn2
FROM (
SELECT zip, zd, ROW_NUMBER() OVER (PARTITION BY zip ORDER BY zd DESC) AS rn
FROM (
SELECT zr.zip, zipdistance(zr.zip, za.zip) AS zd
FROM zips_required zr
JOIN zips_available za
)
)
WHERE rn <= n
)
WHERE rn2 = 1
For each zip_required
, this will select the minimal distance into which fit N
zip_available
's, or maximal distance if the number of zip_available
's is less than N
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With