Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Oracle Analytic Question

Given a function zipdistance(zipfrom,zipto) which calculates the distance (in miles) between two zip codes and the following tables:

create table zips_required(
   zip varchar2(5)
);

create table zips_available(
   zip varchar2(5),
   locations number(100)
);

How can I construct a query that will return to me each zip code from the zips_required table and the minimum distance that would produce a sum(locations) >= n.

Up till now we've just run an exhaustive loop querying for each radius until we've met the criteria.

--Do this over and over incrementing the radius until the minimum requirement is met
select count(locations) 
from zips_required zr 
left join zips_available za on (zipdistance(zr.zip,za.zip)< 2) -- Where 2 is the radius

This can take a while on a large list. It feels like this could be done with an oracle analytic query along the lines of:

min() over (
  partition by zips_required.zip 
  order by zipdistance( zips_required.zip, zips_available.zip)
  --range stuff here?
) 

The only analytic queries I have done have been "row_number over (partition by order by)" based, and I'm treading into unknown areas with this. Any guidance on this is greatly appreciated.

like image 611
Josh Bush Avatar asked Jun 23 '09 16:06

Josh Bush


People also ask

What is analytic function in Oracle with example?

Analytical functions are used to do 'analyze' data over multiple rows and return the result in the current row. E.g Analytical functions can be used to find out running totals, ranking the rows, do some aggregation on the previous or forthcoming row etc.

What is the difference between aggregate and analytic functions in Oracle?

1) Analytical function works on each rows of table and return single data whereas Aggregate function works on each rows and returns multiple set of data.

What are analytic queries?

With analytical queries, you can combine data from multiple queries from the same or differing data sources into one result set. In some situations, you might need to draw data from several different sets of data, some of which might be stored in different data sources.


2 Answers

This is what I came up with :

SELECT zr, min_distance
  FROM (SELECT zr, min_distance, cnt, 
               row_number() over(PARTITION BY zr ORDER BY min_distance) rnk
           FROM (SELECT zr.zip zr, zipdistance(zr.zip, za.zip) min_distance,
                         COUNT(za.locations) over(
                             PARTITION BY zr.zip 
                             ORDER BY zipdistance(zr.zip, za.zip)
                         ) cnt
                    FROM zips_required zr
                   CROSS JOIN zips_available za)
          WHERE cnt >= :N)
 WHERE rnk = 1
  1. For each zip_required calculate the distance to the zip_available and sort them by distance
  2. For each zip_required the count with range allows you to know how many zip_availables are in the radius of that distance.
  3. filter (first where COUNT(locations) > N)

I used to create sample data:

INSERT INTO zips_required
   SELECT to_char(10000 + 100 * ROWNUM) FROM dual CONNECT BY LEVEL <= 5;

INSERT INTO zips_available
   (SELECT to_number(zip) + 10 * r, 100 - 10 * r FROM zips_required, (SELECT ROWNUM r FROM dual CONNECT BY LEVEL <= 9));

CREATE OR REPLACE FUNCTION zipdistance(zipfrom VARCHAR2,zipto VARCHAR2) RETURN NUMBER IS
BEGIN
   RETURN abs(to_number(zipfrom) - to_number(zipto));
END zipdistance;
/

Note: you used COUNT(locations) and SUM(locations) in your question, I assumed it was COUNT(locations)

like image 167
Vincent Malgrat Avatar answered Sep 29 '22 22:09

Vincent Malgrat


SELECT  *
FROM    (
        SELECT  zip, zd, ROW_NUMBER() OVER (PARTITION BY zip ORDER BY rn DESC) AS rn2
        FROM    (
                SELECT  zip, zd, ROW_NUMBER() OVER (PARTITION BY zip ORDER BY zd DESC) AS rn
                FROM    (
                        SELECT  zr.zip, zipdistance(zr.zip, za.zip) AS zd
                        FROM    zips_required zr
                        JOIN    zips_available za
                        )
                )
        WHERE   rn <= n
        )
WHERE   rn2 = 1

For each zip_required, this will select the minimal distance into which fit N zip_available's, or maximal distance if the number of zip_available's is less than N.

like image 28
Quassnoi Avatar answered Sep 29 '22 22:09

Quassnoi