I have a legacy innodb table Listing containing a business with latitude/longitude. Given an input latitude/longitude (below 51.2167/4.41667), the query is to return the first active, enabled, not-deleted 30 businesses in order of proximity (kilometers). A join with the accounts table is made to check the validity of the listing.
select
listing.*
from
listing listing ,
account account
where
listing.account_id = account.id
and listing.active = 1
and listing.approved = 1
and listing.deleted = 0
and listing.enabled = 1
and account.enabled = 1
and account.activated_by_user = 1
group by
listing.id
having
111.222569*degrees(acos(sin(radians(listing.latitude))*sin(radians( 51.2167)) +cos(radians(listing.latitude))*cos(radians( 51.2167))*cos(radians(listing.longitude - 4.41667)))) < 250
order by
111.222569*degrees(acos(sin(radians(listing.latitude))*sin(radians( 51.2167)) +cos(radians(listing.latitude))*cos(radians( 51.2167))*cos(radians(listing.longitude - 4.41667))))
limit 30;
The table Listing and Account each contain over 50,000 rows, but the query still takes on average 24sec to run. Without the order by, it takes 17sec.
I've already tried setting some indexes on active, approved, deleted, enabled. Can I rewrite the query or add certain indexes to efficiently perform this query - without changing the table structure?
+----+-------------+---------+-------------+-------------------------------------------------------------------------------------------------+-----------------------------------------------------------------+---------+------------------------+------+--------------------------------------------------------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------------+-------------------------------------------------------------------------------------------------+-----------------------------------------------------------------+---------+------------------------+------+--------------------------------------------------------------------------------------------------------------------------------+
| 1 | SIMPLE | listing | index_merge | FKB4DC521D9306A80C,listing_active,listing_approved,listing_enabled,listing_deleted,index_test_1 | listing_active,listing_approved,listing_enabled,listing_deleted | 1,1,1,1 | NULL | 3392 | Using intersect(listing_active,listing_approved,listing_enabled,listing_deleted); Using where; Using temporary; Using filesort |
| 1 | SIMPLE | account | eq_ref | PRIMARY,account_enabled,account_activated_by_user,index_test_2 | PRIMARY | 8 | ctm.listing.account_id | 1 | Using where |
+----+-------------+---------+-------------+-------------------------------------------------------------------------------------------------+-----------------------------------------------------------------+---------+------------------------+------+--------------------------------------------------------------------------------------------------------------------------------+
Any help is greatly appreciated.
To find locations in your markers table that are within a certain radius distance of a given latitude/longitude, you can use a SELECT statement based on the Haversine formula. The Haversine formula is used generally for computing great-circle distances between two pairs of coordinates on a sphere.
Storing Latitude & Longitude data as Floats or Decimal This is one of the most fundamental ways of storing geocoordinate data. Latitude & longitude values can be represented & stored in a SQL database using decimal points (Decimal degrees) rather than degrees (or Degrees Minutes Seconds).
SPATIAL INDEX creates an R-tree index. For storage engines that support nonspatial indexing of spatial columns, the engine creates a B-tree index. A B-tree index on spatial values is useful for exact-value lookups, but not for range scans.
This is taking a long time because your query is computing that great-circle-distance formula with all the transcendental functions once for each row in your 50k row table (twice when you include the sort).
Can you limit the distance range of your search? You've probably noticed that most store-finder web apps have a pulldown menu item giving a choice "within 5 miles," "within 10 miles," and so forth.
If you CAN do this, you should add a WHERE clause to your search, and optimize it by putting an index on your LATITUDE column. Let's say you use the value RANGELIMIT for your search range limit, in miles.
Try this clause
WHERE LISTING.LATITUDE BETWEEN LOCATION.LATITUDE - (RANGELIMIT * 1.1508/60)
AND LOCATION.LATITUDE + (RANGELIMIT * 1.1508/60)
This works because a nautical mile is almost exactly equal to one minute (1/60th) of a degree of latitude. The 1.1508 factor converts nautical miles into statute miles.
The clause I suggested will use a latitude index to narrow down the search, and you'll compute the great circle distance a whole lot less frequently.
You could also include a BETWEEN clause on longitude. But in my experience just doing the latitude BETWEEN search gets you excellent results.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With