
GROUP BY and GROUP_CONCAT: optimizing a MySQL query without using the main PK

My example runs on MySQL version 5.6.34-log.

Problem summary: the query below takes 40 seconds. Total row counts for the tables involved:

  • ORDER_ITEM has 758423 records

  • PAYMENT has 177272 records

  • submission_entry has 2165698 records

Details:

  • The query is in [1].

  • I added SQL_NO_CACHE so that repeated test runs are not served from the
    query cache.

  • I added the indexes in [2], but saw no significant improvement.

  • The table structures are in [3].

  • The EXPLAIN plan is in [4].

[1]

     SELECT SQL_NO_CACHE
          `payment`.`id`                                                                                    AS id,
          `order_item`.`order_id`                                                                           AS order_id,


          GROUP_CONCAT(DISTINCT (CASE WHEN submission_entry.text = '' OR submission_entry.text IS NULL
            THEN ' '
                                 ELSE submission_entry.text END) ORDER BY question.var DESC SEPARATOR 0x1D) AS buyer,


          event.name                                                                                        AS event,
          COUNT(DISTINCT CASE WHEN (`order_item`.status > 0 OR (
            `order_item`.status != -1 AND `order_item`.status >= -2 AND `payment`.payment_type_id != 8 AND
            payment.make_order_free = 1))
            THEN `order_item`.id
                         ELSE NULL END)                                                                     AS qty,
          payment.currency                                                                                  AS `currency`,
          (SELECT SUM(order_item.sub_total)
           FROM order_item
           WHERE payment_id =
                 payment.id)                                                                                AS sub_total,
          CASE WHEN payment.make_order_free = 1
            THEN ROUND(payment.total + COALESCE(refunds_total, 0), 2)
          ELSE ROUND(payment.total, 2) END                                                                  AS 'total',
          `payment_type`.`name`                                                                             AS payment_type,
          payment_status.name                                                                               AS status,
          `payment_status`.`id`                                                                             AS status_id,
          DATE_FORMAT(CONVERT_TZ(order_item.`created`, '+0:00', '-8:00'),
                      '%Y-%m-%d %H:%i')                                                                     AS 'created',
          `user`.`name`                                                                                     AS 'agent',
          event.id                                                                                          AS event_id,
          payment.checked,
          DATE_FORMAT(CONVERT_TZ(payment.checked_date, '+0:00', '-8:00'),
                      '%Y-%m-%d %H:%i')                                                                     AS checked_date,
          DATE_FORMAT(CONVERT_TZ(`payment`.`complete_date`, '+0:00', '-8:00'),
                      '%Y-%m-%d %H:%i')                                                                     AS `complete date`,
          `payment`.`delivery_status`                                                                       AS `delivered`
        FROM `order_item`
          INNER JOIN `payment`
            ON payment.id = `order_item`.`payment_id` AND (payment.status > 0.0 OR payment.status = -3.0)
          LEFT JOIN (SELECT
                       sum(`payment_refund`.total) AS `refunds_total`,
                       payment_refunds.payment_id  AS `payment_id`
                     FROM payment
                       INNER JOIN `payment_refunds` ON payment_refunds.payment_id = payment.id
                       INNER JOIN `payment` AS `payment_refund`
                         ON `payment_refund`.id = `payment_refunds`.payment_id_refund
                     GROUP BY `payment_refunds`.payment_id) AS `refunds` ON `refunds`.payment_id = payment.id
#           INNER JOIN event_date_product ON event_date_product.id = order_item.event_date_product_id
#           INNER JOIN event_date ON event_date.id = event_date_product.event_date_id
          INNER JOIN event ON event.id = order_item.event_id
          INNER JOIN payment_status ON payment_status.id = payment.status
          INNER JOIN payment_type ON payment_type.id = payment.payment_type_id
          LEFT JOIN user ON user.id = payment.completed_by
          LEFT JOIN submission_entry ON submission_entry.form_submission_id = `payment`.`form_submission_id`
          LEFT JOIN question ON question.id = submission_entry.question_id AND question.var IN ('name', 'email')
        WHERE 1 = '1' AND (order_item.status > 0.0 OR order_item.status = -2.0)
        GROUP BY `order_item`.`order_id`
        HAVING 1 = '1'
        ORDER BY `order_item`.`order_id` DESC
        LIMIT 10

[2]

CREATE INDEX order_id
  ON order_item (order_id);

CREATE INDEX payment_id
  ON order_item (payment_id);

CREATE INDEX status
  ON order_item (status);

Second table

CREATE INDEX payment_type_id
  ON payment (payment_type_id);

CREATE INDEX status
  ON payment (status);

[3]

CREATE TABLE order_item
(
  id                         INT AUTO_INCREMENT
    PRIMARY KEY,
  order_id                   INT                                 NOT NULL,
  form_submission_id         INT                                 NULL,
  status                     DOUBLE DEFAULT '0'                  NULL,
  payment_id                 INT DEFAULT '0'                     NULL
);

Second table

CREATE TABLE payment
(
  id                 INT AUTO_INCREMENT,
  payment_type_id    INT                                 NOT NULL,
  status             DOUBLE                              NOT NULL,
  form_submission_id INT                                 NOT NULL,
  PRIMARY KEY (id, payment_type_id)
);

[4] The EXPLAIN output, as an HTML table:

<!DOCTYPE html>
<html>
<head>
  <title></title>
</head>
<body>
<table border="1" style="border-collapse:collapse">
<tr><th>id</th><th>select_type</th><th>table</th><th>type</th><th>possible_keys</th><th>key</th><th>key_len</th><th>ref</th><th>rows</th><th>Extra</th></tr>
<tr><td>1</td><td>PRIMARY</td><td>payment_status</td><td>range</td><td>PRIMARY</td><td>PRIMARY</td><td>8</td><td>NULL</td><td>4</td><td>Using where; Using temporary; Using filesort</td></tr>
<tr><td>1</td><td>PRIMARY</td><td>payment</td><td>ref</td><td>PRIMARY,payment_type_id,status</td><td>status</td><td>8</td><td>exp_live_18092017.payment_status.id</td><td>17357</td><td></td></tr>
<tr><td>1</td><td>PRIMARY</td><td>payment_type</td><td>eq_ref</td><td>PRIMARY</td><td>PRIMARY</td><td>4</td><td>exp_live_18092017.payment.payment_type_id</td><td>1</td><td></td></tr>
<tr><td>1</td><td>PRIMARY</td><td>user</td><td>eq_ref</td><td>PRIMARY</td><td>PRIMARY</td><td>4</td><td>exp_live_18092017.payment.completed_by</td><td>1</td><td></td></tr>
<tr><td>1</td><td>PRIMARY</td><td>submission_entry</td><td>ref</td><td>form_submission_id,idx_submission_entry_1</td><td>form_submission_id</td><td>4</td><td>exp_live_18092017.payment.form_submission_id</td><td>2</td><td></td></tr>
<tr><td>1</td><td>PRIMARY</td><td>question</td><td>eq_ref</td><td>PRIMARY,var</td><td>PRIMARY</td><td>4</td><td>exp_live_18092017.submission_entry.question_id</td><td>1</td><td>Using where</td></tr>
<tr><td>1</td><td>PRIMARY</td><td>order_item</td><td>ref</td><td>status,payment_id</td><td>payment_id</td><td>5</td><td>exp_live_18092017.payment.id</td><td>3</td><td>Using where</td></tr>
<tr><td>1</td><td>PRIMARY</td><td>event</td><td>eq_ref</td><td>PRIMARY</td><td>PRIMARY</td><td>4</td><td>exp_live_18092017.order_item.event_id</td><td>1</td><td></td></tr>
<tr><td>1</td><td>PRIMARY</td><td>&lt;derived3&gt;</td><td>ref</td><td>key0</td><td>key0</td><td>5</td><td>exp_live_18092017.payment.id</td><td>10</td><td>Using where</td></tr>
<tr><td>3</td><td>DERIVED</td><td>payment_refunds</td><td>index</td><td>payment_id,payment_id_refund</td><td>payment_id</td><td>4</td><td>NULL</td><td>1110</td><td></td></tr>
<tr><td>3</td><td>DERIVED</td><td>payment</td><td>ref</td><td>PRIMARY</td><td>PRIMARY</td><td>4</td><td>exp_live_18092017.payment_refunds.payment_id</td><td>1</td><td>Using index</td></tr>
<tr><td>3</td><td>DERIVED</td><td>payment_refund</td><td>ref</td><td>PRIMARY</td><td>PRIMARY</td><td>4</td><td>exp_live_18092017.payment_refunds.payment_id_refund</td><td>1</td><td></td></tr>
<tr><td>2</td><td>DEPENDENT SUBQUERY</td><td>order_item</td><td>ref</td><td>payment_id</td><td>payment_id</td><td>5</td><td>func</td><td>3</td><td></td></tr></table>
</body>
</html>

Expected result

The query should run in under 5 seconds instead of 40.

Important updates

1) Reply to comment 1: there are no foreign keys at all on those two tables.

UPDATE-1: Locally, the original query takes 40 seconds. If I remove only the following, it drops to 25 seconds, saving 15 seconds:

GROUP_CONCAT(DISTINCT (CASE WHEN submission_entry.text = '' OR submission_entry.text IS NULL
    THEN ' '
                         ELSE submission_entry.text END) ORDER BY question.var DESC SEPARATOR 0x1D) AS buyer

If I remove only the following, the time stays at around 40 seconds, with no saving:

COUNT(DISTINCT CASE WHEN (`order_item`.status > 0 OR (
    `order_item`.status != -1 AND `order_item`.status >= -2 AND `payment`.payment_type_id != 8 AND
    payment.make_order_free = 1))
    THEN `order_item`.id
                 ELSE NULL END)                                                                     AS qty,

If I remove only the following, it takes around 36 seconds, saving 4 seconds:

(SELECT SUM(order_item.sub_total)
   FROM order_item
   WHERE payment_id =
         payment.id)                                                                                AS sub_total,
  CASE WHEN payment.make_order_free = 1
    THEN ROUND(payment.total + COALESCE(refunds_total, 0), 2)
  ELSE ROUND(payment.total, 2) END                                                                  AS 'total',
asked Oct 01 '17 07:10 by shareef


1 Answer

Remove HAVING 1=1; the Optimizer may not be smart enough to ignore it. Please provide EXPLAIN SELECT (not in html) to see what the Optimizer is doing.
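
For a plan that pastes cleanly as text, something like the following could be run in the mysql command-line client. This is only a sketch: the SELECT is shortened to two of the question's tables for illustration, and `\G` is a client feature that prints each plan row vertically.

```sql
-- Sketch: the real statement would be the full query from [1], minus HAVING 1 = '1'.
EXPLAIN
SELECT order_item.order_id
FROM order_item
  INNER JOIN payment ON payment.id = order_item.payment_id
WHERE order_item.status > 0 OR order_item.status = -2
GROUP BY order_item.order_id
ORDER BY order_item.order_id DESC
LIMIT 10\G

-- MySQL 5.6 also accepts a JSON form of the plan, which carries more detail.
EXPLAIN FORMAT=JSON
SELECT order_item.order_id
FROM order_item
WHERE order_item.status > 0
LIMIT 10;
```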

It seems wrong to have a composite PK in this case: PRIMARY KEY (id, payment_type_id). Please justify it.
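
If id alone is unique in payment (likely, since it is AUTO_INCREMENT, but an assumption worth verifying), the key could be narrowed in a single statement. A sketch:

```sql
-- Check first: any id that appears more than once would make the new key fail.
SELECT id, COUNT(*) AS copies
FROM payment
GROUP BY id
HAVING copies > 1;

-- Drop and re-add the primary key in one ALTER so the AUTO_INCREMENT
-- column is never left without a key.
ALTER TABLE payment
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (id);
```

payment_type_id keeps its own secondary index from [2], so lookups by payment type would still be indexed. The ALTER rebuilds the table, so it is best tried on a copy first.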

Please explain the meaning of status or the need for DOUBLE: status DOUBLE
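
If status only ever holds whole numbers (an assumption; the query in [1] compares it with values such as -3, -2, -1 and 0), a check like this would confirm it before considering an integer type:

```sql
-- Rows whose status has a fractional part; empty results suggest DOUBLE is not needed.
SELECT id, status FROM payment    WHERE status <> FLOOR(status) LIMIT 10;
SELECT id, status FROM order_item WHERE status <> FLOOR(status) LIMIT 10;

-- Range of values in use, to judge which integer size would fit.
SELECT MIN(status), MAX(status) FROM payment;
SELECT MIN(status), MAX(status) FROM order_item;
```

Any type change on payment.status would also have to match payment_status.id, since the query joins those two columns.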

It will take some effort to figure out why the query is so slow. Let's start by tossing the normalization parts, such as dates and event name and currency. That is, whittle the query down to just enough to find the desired rows, without the details on each row. If it is still slow, let's debug that. If it is 'fast', then add the other stuff back, one piece at a time, to find out what is causing the performance issue.
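
As a starting point for that whittling, a stripped-down version might look like this. It keeps only the joins and filters from [1] that appear to decide which orders qualify; whether the dropped INNER JOINs (event, payment_status, payment_type) also filter out rows is an assumption to verify.

```sql
-- Sketch: find the 10 newest qualifying order_ids and nothing else.
SELECT SQL_NO_CACHE
  order_item.order_id
FROM order_item
  INNER JOIN payment
    ON payment.id = order_item.payment_id
   AND (payment.status > 0 OR payment.status = -3)
WHERE order_item.status > 0 OR order_item.status = -2
GROUP BY order_item.order_id
ORDER BY order_item.order_id DESC
LIMIT 10;
```

If this is fast, the remaining columns and joins can be added back one at a time, timing each step.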

Is just id the PRIMARY KEY of each table? Or are there more exceptions (like payment)?

It seems 'wrong' to specify a value for question.var, but then use LEFT to imply that it is optional. Please change all LEFT JOINs to INNER JOINs unless I am mistaken on this issue.
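
To try that suggestion in isolation, the buyer column could be computed on its own with INNER JOINs. A sketch using the names from [1]; note that payments with no 'name' or 'email' entry would drop out of this result, which may or may not be acceptable:

```sql
SELECT SQL_NO_CACHE
  payment.id,
  -- COALESCE(NULLIF(...)) has the same effect as the CASE expression in [1]:
  -- empty or NULL text becomes a single space.
  GROUP_CONCAT(DISTINCT COALESCE(NULLIF(submission_entry.text, ''), ' ')
               ORDER BY question.var DESC SEPARATOR 0x1D) AS buyer
FROM payment
  INNER JOIN submission_entry
    ON submission_entry.form_submission_id = payment.form_submission_id
  INNER JOIN question
    ON question.id = submission_entry.question_id
   AND question.var IN ('name', 'email')
WHERE payment.status > 0 OR payment.status = -3
GROUP BY payment.id
ORDER BY payment.id DESC
LIMIT 10;
```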

Are any of the tables (perhaps submission_entry and event_date_product) "many-to-many" mapping tables? If so, then follow the tips here to get some performance gains.
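
As a general illustration (not necessarily the exact tips referred to above), a commonly recommended layout for a pure many-to-many mapping table uses the pair of ids as the primary key, plus a second index for the reverse direction. The table and column names below are hypothetical:

```sql
-- Hypothetical mapping table; names are illustrative only and not from the question.
CREATE TABLE event_date_product_map (
  event_date_id INT NOT NULL,
  product_id    INT NOT NULL,
  PRIMARY KEY (event_date_id, product_id),   -- lookups by event_date_id; also enforces uniqueness
  INDEX product_first (product_id, event_date_id)  -- covering lookups in the other direction
) ENGINE=InnoDB;
```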

When you come back please provide SHOW CREATE TABLE for each table.
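
In the mysql command-line client, that output could be collected as follows. The table list is taken from the query in [1], and `\G` just prints each result vertically:

```sql
SHOW CREATE TABLE order_item\G
SHOW CREATE TABLE payment\G
SHOW CREATE TABLE payment_refunds\G
SHOW CREATE TABLE submission_entry\G
SHOW CREATE TABLE question\G
SHOW CREATE TABLE event\G
```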

answered Sep 27 '22 22:09 by Rick James