Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using Order By NULL in a UNION

Tags:

sql

mysql

I have a query (see below) that I have a custom developed UDF that is used to calculate whether or not certain points are within a polygon (first query in UNION) or circular (second query in UNION) shape.

select  e.inquiry_match_type_id 
        , a.geo_boundary_id 
        , GeoBoundaryContains(c.tpi_geo_boundary_coverage_type_id, 29.287437, -95.055807, a.lat, a.lon, a.geo_boundary_vertex_id ) in_out 
        , e.inquiry_id 
        , e.external_id 
        , COALESCE(f.inquiry_device_id,0) inquiry_device_id 
        , b.external_info1 
        , b.external_info2 
        , b.geo_boundary_id 
        , b.geo_boundary_type_id 
from    geo_boundary_vertex a 
        join geo_boundary b on b.geo_boundary_id = a.geo_boundary_id 
        join trackpoint_index_geo_boundary_mem c on c.geo_boundary_id = b.geo_boundary_id 
        join trackpoint_index_mem d on d.trackpoint_index_id = c.trackpoint_index_id 
        join inquiry_mem e on e.inquiry_id = b.inquiry_id left 
        outer join inquiry_device_mem f on f.inquiry_id = e.inquiry_id and f.device_id = 3201 
where   d.trackpoint_index_id = 3127  
        and b.geo_boundary_type_id = 3  
        and e.expiration_date >= now() 
group by 
        a.geo_boundary_id 
UNION 
select  e.inquiry_match_type_id      
        , b.geo_boundary_id      
        , GeoBoundaryContains( c.tpi_geo_boundary_coverage_type_id, 29.287437, -95.055807, b.centroid_lat, b.centoid_lon, b.radius ) in_out      
        , e.inquiry_id      
        , e.external_id      
        , COALESCE(f.inquiry_device_id,0) inquiry_device_id      
        , b.external_info1      
        , b.external_info2      
        , b.geo_boundary_id      
        , b.geo_boundary_type_id 
from    geo_boundary b 
        join trackpoint_index_geo_boundary_mem c on c.geo_boundary_id = b.geo_boundary_id 
        join trackpoint_index_mem d on d.trackpoint_index_id = c.trackpoint_index_id 
        join inquiry_mem e on e.inquiry_id = b.inquiry_id 
        left outer join inquiry_device_mem f on f.inquiry_id = e.inquiry_id and f.device_id = 3201 
where   d.trackpoint_index_id = 3127  
        and b.geo_boundary_type_id = 2  
        and e.expiration_date >= now() 
group by
        b.geo_boundary_id 

When I run an explain for the query I get the following:

 id      select_type     table       type     possible_keys                                                                                                                                              key                                  key_len     ref                       rows     Extra                           
 ------  --------------  ----------  -------  ---------------------------------------------------------------------------------------------------------------------------------------------------------  -----------------------------------  ----------  ------------------------  -------  ------------------------------- 
 1       PRIMARY         d           const    PRIMARY                                                                                                                                                    PRIMARY                              4           const                     1        Using temporary; Using filesort 
 1       PRIMARY         c           ref      PRIMARY,fk_mtp_idx_geo_boundary_mtp_idx,fk_mtp_idx_geo_boundary_geo_boundary,fk_mtp_idx_geo_boundary_mtp_mem_idx,fk_mtp_idx_geo_boundary_geo_boundary_mem  fk_mtp_idx_geo_boundary_mtp_idx      4           const                     9                                        
 1       PRIMARY         b           eq_ref   PRIMARY,fk_geo_boundary_inquiry,fk_geo_boundary_geo_boundary_type                                                                                          PRIMARY                              4           gothim.c.geo_boundary_id  1        Using where                     
 1       PRIMARY         e           eq_ref   PRIMARY                                                                                                                                                    PRIMARY                              4           gothim.b.inquiry_id       1        Using where                     
 1       PRIMARY         f           ref      fk_inquiry_device_mem_inquiry                                                                                                                              fk_inquiry_device_mem_inquiry        4           gothim.e.inquiry_id       2                                        
 1       PRIMARY         a           ref      fk_geo_boundary_vertex_geo_boundary                                                                                                                        fk_geo_boundary_vertex_geo_boundary  4           gothim.b.geo_boundary_id  11       Using where                     
 2       UNION           d           const    PRIMARY                                                                                                                                                    PRIMARY                              4           const                     1        Using temporary; Using filesort 
 2       UNION           c           ref      PRIMARY,fk_mtp_idx_geo_boundary_mtp_idx,fk_mtp_idx_geo_boundary_geo_boundary,fk_mtp_idx_geo_boundary_mtp_mem_idx,fk_mtp_idx_geo_boundary_geo_boundary_mem  fk_mtp_idx_geo_boundary_mtp_idx      4           const                     9                                        
 2       UNION           b           eq_ref   PRIMARY,fk_geo_boundary_inquiry,fk_geo_boundary_geo_boundary_type                                                                                          PRIMARY                              4           gothim.c.geo_boundary_id  1        Using where                     
 2       UNION           e           eq_ref   PRIMARY                                                                                                                                                    PRIMARY                              4           gothim.b.inquiry_id       1        Using where                     
 2       UNION           f           ref      fk_inquiry_device_mem_inquiry                                                                                                                              fk_inquiry_device_mem_inquiry        4           gothim.e.inquiry_id       2                                        
 (null)  UNION RESULT    <union1,2>  ALL      (null)                                                                                                                                                     (null)                               (null)      (null)                    (null)   Using filesort                  

 12 record(s) selected [Fetch MetaData: 1ms] [Fetch Data: 5ms] 

Now, I can split the queries up and use the ORDER BY NULL trick to get rid of the filesort however when I attempt to add that to the end of a UNION it doesn't work.

I am considering splitting the query apart into 2 queries or possibly re-writing it completely not to use a UNION (though that is a bit more difficult of course). The other thing I have working against me is that we have this in production and I'd like to limit changes - I would have loved just to be able to add ORDER BY NULL to the end of the query and be done with it, but it doesn't work w/ the UNION.

Any help would be greatly appreciated.

like image 250
Dave LeJeune Avatar asked Mar 30 '11 13:03

Dave LeJeune


1 Answers

Normally, ORDER BY can be used for the individual queries within a UNION like this:

(
SELECT  *
FROM    table1, …
GROUP BY
        id
ORDER BY 
        NULL
)
UNION ALL
(
SELECT  *
FROM    table2, …
GROUP BY
        id
ORDER BY 
        NULL
)

However, as the docs state:

However, use of ORDER BY for individual SELECT statements implies nothing about the order in which the rows appear in the final result because UNION by default produces an unordered set of rows. Therefore, the use of ORDER BY in this context is typically in conjunction with LIMIT, so that it is used to determine the subset of the selected rows to retrieve for the SELECT, even though it does not necessarily affect the order of those rows in the final UNION result. If ORDER BY appears without LIMIT in a SELECT, it is optimized away because it will have no effect anyway.

This is of course a smart move, however, not too smart, since they forgot to optimize away the ordering behavior of GROUP BY as well.

So as for now, you should add a very high LIMIT to your individual queries:

(
SELECT  *
FROM    table1, …
GROUP BY
        id
ORDER BY 
        NULL
LIMIT 100000000
)
UNION ALL
(
SELECT  *
FROM    table2, …
GROUP BY
        id
ORDER BY 
        NULL
LIMIT 100000000
)

I'll post it as a bug to MySQL, hope they'll fix it in the next release, but meanwhile you could use this solution.

Note that a similar solution (using TOP 100%) was used to force ordering of the subqueries in SQL Server 2000, however, it stopped working in 2005 (ORDER BY has no effect in subqueries with TOP 100% for the optimizer).

It is safe to use it though since it won't break your queries even if the optimizer behavior changes in the next releases, but will just make them as slow as they are now.

like image 189
Quassnoi Avatar answered Oct 24 '22 19:10

Quassnoi