Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Joins on spatial mysql indexes

I have two tables: one with points, the other with polys.

CREATE TABLE `points` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `point` point NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM;

CREATE TABLE `ranges` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `poly` polygon NOT NULL,
  PRIMARY KEY (`id`),
  SPATIAL KEY `poly` (`poly`)
) ENGINE=MyISAM;

I want to join ranges to points on points inside polys. Queries look simple:

SELECT * 
  FROM points 
    LEFT JOIN ranges 
      ON MBRCONTAINS(poly, point) 
  WHERE points.id = 2;

This query works fast and uses indexes, part of explain:

table  | type  | possible_keys | key  | key_len
ranges | range | poly          | poly | 34

But, when I try to join with several rows from table points:

SELECT * 
  FROM points 
   LEFT JOIN ranges 
    ON MBRCONTAINS(poly, point) 
  WHERE points.id IN (1,2,3);

everything breaks down:

+----+-------------+------------+-------+---------------+---------+---------+------+--------+-------------+
| id | select_type | table      | type  | possible_keys | key     | key_len | ref  | rows   | Extra       |
+----+-------------+------------+-------+---------------+---------+---------+------+--------+-------------+
|  1 | SIMPLE      | points     | range | PRIMARY       | PRIMARY | 4       | NULL |      3 | Using where |
|  1 | SIMPLE      | ranges     | ALL   | poly          | NULL    | NULL    | NULL | 155183 |             |
+----+-------------+------------+-------+---------------+---------+---------+------+--------+-------------+

Adding FORCE INDEX (poly) does not help.

Sample data to test queries (sorry, only php version, I'm not common with SQL procedures):

//points
for($i=0;$i<=500;$i++) {
    $point = mt_rand();
    mysql_query('INSERT INTO points (point) VALUES (POINTFROMWKB(POINT('.$point.', 0)))');
}

$qty = 20000;
$max = mt_getrandmax();
$add = $max / $qty
$end = 0;

//polys
while($end < $max) {
    $start = $end;
    $end = mt_rand($start, $start + $add);
    mysql_query('INSERT INTO ranges (poly) VALUES (
        GEOMFROMWKB(POLYGON(LINESTRING(
            POINT('.$start.', -1),
            POINT('.$end.',   -1),
            POINT('.$end.',    1),
            POINT('.$start.',  1),
            POINT('.$start.', -1)
          )))
    )');
}
like image 286
Oroboros102 Avatar asked Nov 23 '11 15:11

Oroboros102


2 Answers

I believe that it's because MySQL doesn't support merging spatial indexes. Not sure if it's still true but I've read it somewhere in the past. If you have an OR statement, then the spatial indexes are not used

In your case, where are you doing points.id = 1, that's a straight select with one result returned that gets used in the mbrcontains. That uses the index.

When you add points.in (1,2,3), that returns 3 results and each needs to be mapped to the ranges table, therefore not working

result

id  select_type     table   type    possible_keys   key     key_len     ref     rows    filtered    Extra
1   SIMPLE  points  range   PRIMARY     PRIMARY     4   NULL    3   100.00  Using where
1   SIMPLE  ranges  ALL     poly    NULL    NULL    NULL    6467418     100.00   

You can simplify your test without the the point table by doing this: SELECT * FROM ranges where mbrcontains( poly, GEOMFROMWKB(POINT(0, 0)))

id  select_type     table   type    possible_keys   key     key_len     ref     rows    filtered    Extra
1   SIMPLE  ranges  range   poly    poly    34  NULL    1   100.00  Using where

And now this; SELECT * FROM ranges where mbrcontains( poly, GEOMFROMWKB(POINT(0, 0))) OR mbrcontains( poly, GEOMFROMWKB(POINT(10, 10)))

result

id  select_type     table   type    possible_keys   key     key_len     ref     rows    filtered    Extra
1   SIMPLE  ranges  ALL     poly    NULL    NULL    NULL    6467418     100.00  Using where

See that in the second case, you are not using index and just scanning.

You could force the query to use index by creating UNION for each specific point but I am not sure if that's going to be faster. I did some tests locally and it was a bit slower than your first query.

EXPLAIN EXTENDED 
SELECT *
FROM points
FORCE INDEX (PRIMARY )
LEFT JOIN ranges
FORCE INDEX ( poly ) ON mbrcontains( poly, point )
WHERE points.id = 1
UNION DISTINCT
SELECT *
FROM points
FORCE INDEX (PRIMARY )
LEFT JOIN ranges
FORCE INDEX ( poly ) ON mbrcontains( poly, point )
WHERE points.id = 2
UNION DISTINCT
SELECT *
FROM points
FORCE INDEX (PRIMARY )
LEFT JOIN ranges
FORCE INDEX ( poly ) ON mbrcontains( poly, point )
WHERE points.id = 3

result

id  select_type     table   type    possible_keys   key     key_len     ref     rows    filtered    Extra
1   PRIMARY     points  const   PRIMARY     PRIMARY     4   const   1   100.00   
1   PRIMARY     ranges  range   poly    poly    34  NULL    1   100.00  Using where
2   UNION   points  const   PRIMARY     PRIMARY     4   const   1   100.00   
2   UNION   ranges  range   poly    poly    34  NULL    1   100.00  Using where
3   UNION   points  const   PRIMARY     PRIMARY     4   const   1   100.00   
3   UNION   ranges  range   poly    poly    34  NULL    1   100.00  Using where
NULL    UNION RESULT    <union1,2,3>    ALL     NULL    NULL    NULL    NULL    NULL    NULL     
like image 99
Alexey Gerasimov Avatar answered Nov 05 '22 03:11

Alexey Gerasimov


I have used successfully similar queries, with only one difference in data model: a spatial key on the points database. In my case:

CREATE TABLE geopoints (
  pid int(11) NOT NULL AUTO_INCREMENT,
  description varchar(255) NOT NULL DEFAULT '',
  geopoint point NOT NULL,
  PRIMARY KEY (pid),
  SPATIAL KEY geopoint (geopoint) 
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

And everything went well in queries like this:

SELECT pt.pid, x(geopoint), Y(geopoint), pl.pid, AsText(geopolygon) 
  FROM geopoints pt INNER JOIN geopolygons pl ON MBRCONTAINS(geopolygon, geopoint)
 WHERE pt.pid IN (1,2,4,5) AND pl.pid BETWEEN 1 AND 5;

my two cents,

like image 39
Athanassios Stavrakoudis Avatar answered Nov 05 '22 03:11

Athanassios Stavrakoudis