Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Mysql Partitioning Query Performance

i have created partitions on pricing table. below is the alter statement.

ALTER TABLE `price_tbl` 
PARTITION BY HASH(man_code)
PARTITIONS 87;

one partition consists of 435510 records. total records in price_tbl is 6 million.

EXPLAIN query showing only one partion is used for the query . Still the query takes 3-4 sec to execute. below is the query

 EXPLAIN SELECT vrimg.image_cap_id,vm.man_name,vr.range_code,vr.range_name,vr.range_url, MIN(`finance_rental`) AS from_price, vd.der_id AS vehicle_id FROM `range_tbl` vr 
    LEFT JOIN `image_tbl` vrimg ON vr.man_code = vrimg.man_code AND vr.type_id = vrimg.type_id AND vr.range_code = vrimg.range_code 
    LEFT JOIN `manufacturer_tbl` vm ON vr.man_code = vm.man_code AND vr.type_id = vm.type_id 
    LEFT JOIN `derivative_tbl` vd ON vd.man_code=vm.man_code AND vd.type_id = vr.type_id AND vd.range_code=vr.range_code 
    LEFT JOIN `price_tbl` vp ON vp.vehicle_id = vd.der_id AND vd.type_id = vp.type_id AND vp.product_type_id=1 AND vp.maintenance_flag='N'  AND vp.man_code=164 
    AND vp.initial_rentals_id =(SELECT rental_id FROM `rentals_tbl` WHERE rental_months='9') 
    AND vp.annual_mileage_id =(SELECT annual_mileage_id FROM `mileage_tbl` WHERE annual_mileage='8000') 
    WHERE vr.type_id = 1 AND vm.man_url = 'audi' AND vd.type_id IS NOT NULL GROUP BY vd.der_id

Result of EXPLAIN.

enter image description here

Same query without partitioning takes 3-4 sec. Query with partitioning takes 2-3 sec.

how we can increase query performance as it is too slow yet.

attached create table structure.

  1. price table - This consists 6 million records
CREATE TABLE `price_tbl` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `lender_id` bigint(20) DEFAULT NULL,
  `type_id` bigint(20) NOT NULL,
  `man_code` bigint(20) NOT NULL,
  `vehicle_id` bigint(20) DEFAULT NULL,
  `product_type_id` bigint(20) DEFAULT NULL,
  `initial_rentals_id` bigint(20) DEFAULT NULL,
  `term_id` bigint(20) DEFAULT NULL,
  `annual_mileage_id` bigint(20) DEFAULT NULL,
  `ref` varchar(255) DEFAULT NULL,
  `maintenance_flag` enum('Y','N') DEFAULT NULL,
  `finance_rental` decimal(20,2) DEFAULT NULL,
  `monthly_rental` decimal(20,2) DEFAULT NULL,
  `maintenance_payment` decimal(20,2) DEFAULT NULL,
  `initial_payment` decimal(20,2) DEFAULT NULL,
  `doc_fee` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`id`,`type_id`,`man_code`),
  KEY `type_id` (`type_id`),
  KEY `vehicle_id` (`vehicle_id`),
  KEY `term_id` (`term_id`),
  KEY `product_type_id` (`product_type_id`),
  KEY `finance_rental` (`finance_rental`),
  KEY `type_id_2` (`type_id`,`vehicle_id`),
  KEY `maintenanace_idx` (`maintenance_flag`),
  KEY `lender_idx` (`lender_id`),
  KEY `initial_idx` (`initial_rentals_id`),
  KEY `man_code_idx` (`man_code`)
) ENGINE=InnoDB AUTO_INCREMENT=5830708 DEFAULT CHARSET=latin1
/*!50100 PARTITION BY HASH (man_code)
PARTITIONS 87 */
  1. derivative table - This consists 18k records.
CREATE TABLE `derivative_tbl` (
  `type_id` bigint(20) DEFAULT NULL,
  `der_cap_code` varchar(20) DEFAULT NULL,
  `der_id` bigint(20) DEFAULT NULL,
  `body_style_id` bigint(20) DEFAULT NULL,
  `fuel_type_id` bigint(20) DEFAULT NULL,
  `trans_id` bigint(20) DEFAULT NULL,
  `man_code` bigint(20) DEFAULT NULL,
  `range_code` bigint(20) DEFAULT NULL,
  `model_code` bigint(20) DEFAULT NULL,
  `der_name` varchar(255) DEFAULT NULL,
  `der_url` varchar(255) DEFAULT NULL,
  `der_intro_year` date DEFAULT NULL,
  `der_disc_year` date DEFAULT NULL,
  `der_last_spec_date` date DEFAULT NULL,
  KEY `der_id` (`der_id`),
  KEY `type_id` (`type_id`),
  KEY `man_code` (`man_code`),
  KEY `range_code` (`range_code`),
  KEY `model_code` (`model_code`),
  KEY `body_idx` (`body_style_id`),
  KEY `capcodeidx` (`der_cap_code`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
  1. range table - This consists 1k records
CREATE TABLE `range_tbl` (
  `type_id` bigint(20) DEFAULT NULL,
  `man_code` bigint(20) DEFAULT NULL,
  `range_code` bigint(20) DEFAULT NULL,
  `range_name` varchar(255) DEFAULT NULL,
  `range_url` varchar(255) DEFAULT NULL,
  KEY `range_code` (`range_code`),
  KEY `type_id` (`type_id`),
  KEY `man_code` (`man_code`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
like image 342
jit Avatar asked Jun 26 '26 23:06

jit


1 Answers

PARTITION BY HASH is essentially useless if you are hoping for improved performance. BY RANGE is useful in a few use cases_.

In most situations, improvements in indexes are as good as trying to use partitioning.

Some likely problems:

  • No explicit PRIMARY KEY for InnoDB tables. Add a natural PK, if applicable, else an AUTO_INCREMENT.
  • No "composite" indexes -- they often provide a performance boost. Example: The LEFT JOIN between vr and vrimg involves 3 columns; a composite index on those 3 columns in the 'right' table will probably help performance.
  • Blind use of BIGINT when smaller datatypes would work. (This is an I/O issue when the table is big.)
  • Blind use of 255 in VARCHAR.
  • Consider whether most of the columns should be NOT NULL.
  • That query may be a victim of the "explode-implode" syndrome. This is where you do JOIN(s), which create a big intermediate table, followed by a GROUP BY to bring the row-count back down.
  • Don't use LEFT unless the 'right' table really is optional. (I see LEFT JOIN vd ... vd.type_id IS NOT NULL.)
  • Don't normalize "continuous" values (annual_mileage and rental_months). It is not really beneficial for "=" tests, and it severely hurts performance for "range" tests.

Same query without partitioning takes 3-4 sec. Query with partitioning takes 2-3 sec.

The indexes almost always need changing when switching between partitioning and non-partitioning. With the optimal indexes for each case, I predict that performance will be close to the same.

Indexes

These should help performance whether or not it is partitioned:

vm:     (man_url)
vr:     (man_code, type_id)  -- either order
vd:     (man_code, type_id, range_code, der_id)
              -- `der_id` 4th, else in any order (covering)
vrimg:  (man_code, type_id, range_code, image_cap_id)
              -- `image_cap_id` 4th, else in any order (covering)
vp:     (type_id, der_id, product_type_id, maintenance_flag,
         initial_rentals, annual_mileage, man_code)
             -- any order (covering)

A "covering" index is an extra boost, in that it can do all the work just in the index's BTree, without touching the data's BTree.

Implement a bunch of what I recommend, then come back (in another Question) for further tweaking.

Usually the "partition key" should be last in a composite index.

like image 188
Rick James Avatar answered Jun 29 '26 14:06

Rick James