Background
I have a Java application which periodically takes a resultset from one database (MySQL), and attempts to find matches in another database (Postgres / PostGIS to be specific).
Problem
Presently, the application queries the Postgres database once for ever MySQL record in the result set (could be upwards of tens of thousands). I'm attempting to alter the algorithm so that the application generates one single query that yields multiple results if any matches are found. Another way to describe my goal is that the query should behave similar to a typical JOIN, if these two tables existed within the same database system.
Current Solution
In order to solve this, I'm creating a virtual table in the FROM clause. However, the only way I know how to do this from a list of values is by writing individual SELECT statements joined with a UNION. The result appears to work, and while I haven't tested for performance with thousands of records, it doesn't appear to have any massive impact using hundreds of such SELECT-UNION statements. This is the relevant portion of the overall query to illustrate what I have done so far:
SELECT *, ST_Distance_Sphere(latlng, geom) as distance
FROM rwis_sites
INNER JOIN
(SELECT 1100 as unit_id, ST_GeomFromText('POINT(-81.19701 32.09279)', 4326) as geom UNION
SELECT 1100 as unit_id, ST_GeomFromText('POINT(-81.19682 32.09224)', 4326) as geom
UNION
SELECT 1100 as unit_id, ST_GeomFromText('POINT(-81.1968 32.09213)', 4326) as geom
UNION
... just a few more...hundred...thousand...
SELECT 2266 as unit_id, ST_GeomFromText('POINT(-97.98719 29.57656)', 4326) as geom
UNION
SELECT 2266 as unit_id, ST_GeomFromText('POINT(-97.98815 29.57602)', 4326) as geom
) virtualTable
ON ST_Distance_Sphere(latlng, geom) < 10000
ORDER BY ST_Distance_Sphere(latlng, geom) ASC limit 1
Since the "virtual table" is generated programmatically, there is little effort on my part.
Question
However, I am concerned whether this is a "goofy" approach (not to mention any performance issues I've yet to detect), and ultimately I am wondering: is there is a better way of creating something similar without thousands of SELECT-UNION statements?
Here is a better way to create a table of values on the fly
select *
from (
values
(1100::int, 'POINT(-81.19701 32.09279)'::geography(Point)),
(1100::int, 'POINT(-81.19682 32.09224)'::geography(Point))
) as t(unit_id, geom)
Yet even better idea is perhaps to use foreign data wrapper to bring your mysql table in PG.
EDIT
You might want to try to pre-screen records from your main table provided it is indexed with ST_Dwithin(latlng, geom, 0.1) provided that 0.1° is just slightly more (mind cos(32.09)) than 10000 m.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With