Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

After DB version change, index won't be used automatically

We recently migrated our Magento application from a private host to AWS web services. We noticed that some of the internal functionality of Magento was taking an excessively long period of time to execute after the migration, so started to investigate.

One of the queries in question is a simple customer select query, with around 9-10 regular joins to attribute tables to get the attributes.

We have run some tests on the query and found that the difference between the old host and AWS is that on the old host, the MySQL optimizer appears to use the correct index, whereas in AWS it resorts to using filesort, ignoring the index.

Using FORCE INDEX(index_name) makes the query execute correctly in AWS, however we don't want to go down this road and would rather fix the issue in the database configuration than make manual hacks throughout our Magento application. To be clear, this is not an issue with our indexes, they are set up correctly.

For background:

  • I have copied all the MySQL parameters from the old host's my.cnf file to a parameter group in RDS, but nothing made any difference
  • The tables are all InnoDB
  • I've run analyze, repair and optimize queries etc
  • The query takes around 45 seconds to complete on RDS
  • The query took around 2 seconds to complete on the old host, or when I use FORCE INDEX() to force RDS to behave in the same way as the old host

The old MySQL server was running version 5.1.61, and the AWS RDS instance we are running is on 5.6.19. A consulting group suggested to us that we downgrade our RDS instance to 5.1.61, however again we don't want to do this as it is not a sustainable solution.

The query in question is below (shortened by removing fields from the select for the sake of space):

SELECT
    `e`.*
    -- various field names here, removed
FROM `customer_entity` AS `e`
LEFT JOIN `customer_entity_int` AS `at_default_billing` ON (`at_default_billing`.`entity_id` = `e`.`entity_id`) AND (`at_default_billing`.`attribute_id` = '13')
LEFT JOIN `customer_address_entity_varchar` AS `at_billing_postcode` ON (`at_billing_postcode`.`entity_id` = `at_default_billing`.`value`) AND (`at_billing_postcode`.`attribute_id` = '30')
LEFT JOIN `customer_address_entity_varchar` AS `at_billing_city` ON (`at_billing_city`.`entity_id` = `at_default_billing`.`value`) AND (`at_billing_city`.`attribute_id` = '26')
LEFT JOIN `customer_address_entity_varchar` AS `at_billing_telephone` ON (`at_billing_telephone`.`entity_id` = `at_default_billing`.`value`) AND (`at_billing_telephone`.`attribute_id` = '31')
LEFT JOIN `customer_address_entity_varchar` AS `at_billing_regione` ON (`at_billing_regione`.`entity_id` = `at_default_billing`.`value`) AND (`at_billing_regione`.`attribute_id` = '28')
LEFT JOIN `customer_address_entity_varchar` AS `at_billing_country_id` ON (`at_billing_country_id`.`entity_id` = `at_default_billing`.`value`) AND (`at_billing_country_id`.`attribute_id` = '27')
LEFT JOIN `core_store` AS `at_store_name` ON (at_store_name.`store_id`=e.store_id)
LEFT JOIN `customer_entity_varchar` AS `firstname` ON e.entity_id = firstname.entity_id AND firstname.attribute_id = 5
LEFT JOIN `customer_entity_varchar` AS `lastname` ON e.entity_id = lastname.entity_id AND lastname.attribute_id = 7 

WHERE (`e`.`entity_type_id` = '1') 

ORDER BY `e`.`entity_id`
DESC LIMIT 20;

A summary of the differences between servers when EXPLAINing this query is below:

RDS - MySQL v5.6.19:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: e
         type: ref
possible_keys: IDX_CUSTOMER_ENTITY_ENTITY_TYPE_ID
          key: IDX_CUSTOMER_ENTITY_ENTITY_TYPE_ID
      key_len: 2
          ref: const
         rows: 653990
        Extra: Using temporary; Using filesort

Old host - MySQL v5.1.61, or when FORCE INDEX is used on RDS:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: e
         type: ref
possible_keys: IDX_CUSTOMER_ENTITY_ENTITY_TYPE_ID
          key: IDX_CUSTOMER_ENTITY_ENTITY_TYPE_ID
      key_len: 2
          ref: const
         rows: 644775
        Extra: Using where

I'm aware that it's highly likely that the difference in those database versions has changed the way the query optimizer works, and I'm not aware of what the differences in those versions are, but I'm looking for a solution that will help us to address those differences.


Edit: here's a comparison list from RDS for the parameter's I've copied over from the old host vs the default parameters for this MySQL version. None of these parameters have affected the result above whether they are there or whether I used standard parameters in RDS:

enter image description here

like image 898
scrowler Avatar asked Oct 30 '14 22:10

scrowler


1 Answers

Looking at the query you provided, it looks like you might be getting snagged by Bug #74030. A patch was contributed for a later version (5.6.20 and 5.7.4), but hasn't been applied yet, looking at the release notes. It might be good to sign-in and vote for that bug.

For now, as much as it pains me, your consultant group may be right (accidentally, perhaps)..until the patch is applied.

like image 108
Steve Siebert Avatar answered Sep 22 '22 03:09

Steve Siebert