Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimizing MySQL query with expensive INNER JOIN

Using trial and error i've discovered that when removing a join from the below query it runs around 30 times quicker. Can someone explain why this would be and if it's possible to optimise the query to include the additional join without the performance hit.

This is a screenshot of the explain which shows that the index isn't being used for the uesr_groups table.

enter image description here

http://i.imgur.com/9VDuV.png

This is the original query:

SELECT `comments`.`comment_id`, `comments`.`comment_html`, `comments`.`comment_time_added`, `comments`.`comment_has_attachments`, `users`.`user_name`, `users`.`user_id`, `users`.`user_comments_count`, `users`.`user_time_registered`, `users`.`user_time_last_active`, `user_profile`.`user_avatar`, `user_profile`.`user_signature_html`, `user_groups`.`user_group_icon`, `user_groups`.`user_group_name`
FROM (`comments`)
INNER JOIN `users` ON `comments`.`comment_user_id` = `users`.`user_id`
INNER JOIN `user_profile` ON `users`.`user_id` = `user_profile`.`user_id`
INNER JOIN `user_groups` ON `users`.`user_group_id` = `user_groups`.`user_group_id`
WHERE `comments`.`comment_enabled` =  1
AND `comments`.`comment_content_id` =  12
ORDER BY `comments`.`comment_time_added` ASC
LIMIT 20

If I remove the "user_groups" join then the query runs 30 times quicker as mentioned above.

SELECT `comments`.`comment_id`, `comments`.`comment_html`, `comments`.`comment_time_added`, `comments`.`comment_has_attachments`, `users`.`user_name`, `users`.`user_id`, `users`.`user_comments_count`, `users`.`user_time_registered`, `users`.`user_time_last_active`, `user_profile`.`user_avatar`, `user_profile`.`user_signature_html`
FROM (`comments`)
INNER JOIN `users` ON `comments`.`comment_user_id` = `users`.`user_id`
INNER JOIN `user_profile` ON `users`.`user_id` = `user_profile`.`user_id`
WHERE `comments`.`comment_enabled` =  1
AND `comments`.`comment_content_id` =  12
ORDER BY `comments`.`comment_time_added` ASC
LIMIT 20

My tables are below, can anyone offer any insight into how to avoid a performance hit for including the user_groups table?

--
-- Table structure for table `comments`
--

CREATE TABLE IF NOT EXISTS `comments` (
  `comment_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `comment_content_id` int(10) unsigned NOT NULL,
  `comment_user_id` mediumint(6) unsigned NOT NULL,
  `comment_original` text NOT NULL,
  `comment_html` text NOT NULL,
  `comment_time_added` int(10) unsigned NOT NULL,
  `comment_time_updated` int(10) unsigned NOT NULL,
  `comment_enabled` tinyint(1) NOT NULL DEFAULT '0',
  `comment_is_spam` tinyint(1) NOT NULL DEFAULT '0',
  `comment_has_attachments` tinyint(1) unsigned NOT NULL,
  `comment_has_edits` tinyint(1) NOT NULL,
  PRIMARY KEY (`comment_id`),
  KEY `comment_user_id` (`comment_user_id`),
  KEY `comment_content_id` (`comment_content_id`),
  KEY `comment_is_spam` (`comment_is_spam`),
  KEY `comment_enabled` (`comment_enabled`),
  KEY `comment_time_updated` (`comment_time_updated`),
  KEY `comment_time_added` (`comment_time_added`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=352 ;

-- --------------------------------------------------------

--
-- Table structure for table `users`
--

CREATE TABLE IF NOT EXISTS `users` (
  `user_id` mediumint(6) unsigned NOT NULL AUTO_INCREMENT,
  `user_ipb_id` int(10) unsigned DEFAULT NULL,
  `user_activated` tinyint(1) NOT NULL DEFAULT '0',
  `user_name` varchar(64) CHARACTER SET latin1 NOT NULL,
  `user_email` varchar(255) NOT NULL,
  `user_password` varchar(40) NOT NULL,
  `user_content_count` int(10) unsigned NOT NULL DEFAULT '0',
  `user_comments_count` int(10) unsigned NOT NULL DEFAULT '0',
  `user_salt` varchar(8) NOT NULL,
  `user_api_key` varchar(32) NOT NULL,
  `user_auth_key` varchar(32) DEFAULT NULL,
  `user_paypal_key` varchar(32) DEFAULT NULL,
  `user_timezone_id` smallint(3) unsigned NOT NULL,
  `user_group_id` tinyint(3) unsigned NOT NULL,
  `user_custom_permission_mask_id` tinyint(3) unsigned DEFAULT NULL,
  `user_lang_id` tinyint(2) unsigned NOT NULL,
  `user_time_registered` int(10) unsigned NOT NULL,
  `user_time_last_active` int(10) unsigned NOT NULL
  PRIMARY KEY (`user_id`),
  UNIQUE KEY `user_email` (`user_email`),
  KEY `user_group_id` (`user_group_id`),
  KEY `user_auth_key` (`user_auth_key`),
  KEY `user_api_key` (`user_api_key`),
  KEY `user_custom_permission_mask_id` (`user_custom_permission_mask_id`),
  KEY `user_time_last_active` (`user_time_last_active`),
  KEY `user_paypal_key` (`user_paypal_key`),
  KEY `user_name` (`user_name`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=33 ;

-- --------------------------------------------------------

--
-- Table structure for table `user_groups`
--

CREATE TABLE IF NOT EXISTS `user_groups` (
  `user_group_id` tinyint(3) unsigned NOT NULL AUTO_INCREMENT,
  `user_group_name` varchar(32) NOT NULL,
  `user_group_permission_mask_id` tinyint(3) unsigned NOT NULL,
  `user_group_icon` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`user_group_id`),
  KEY `user_group_permission_mask_id` (`user_group_permission_mask_id`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=8 ;

-- --------------------------------------------------------

--
-- Table structure for table `user_profile`
--

CREATE TABLE IF NOT EXISTS `user_profile` (
  `user_id` mediumint(8) unsigned NOT NULL,
  `user_signature_original` text,
  `user_signature_html` text,
  `user_avatar` varchar(64) DEFAULT NULL,
  `user_steam_id` varchar(64) DEFAULT NULL,
  `user_ps_id` varchar(16) DEFAULT NULL,
  `user_xbox_id` varchar(64) DEFAULT NULL,
  `user_wii_id` varchar(64) DEFAULT NULL,
  PRIMARY KEY (`user_id`),
  KEY `user_steam_id` (`user_steam_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
like image 283
John Mellor Avatar asked Jan 04 '12 12:01

John Mellor


2 Answers

Most database engines calculate their query plan based on statistics about the tables - for instance, if a table has a small number of rows, it's quicker to go to the table than the index. Those statistics are maintained during "normal" operation - e.g. inserts, updates and deletes - but can get out of sync when table definitions are changed, or when you do bulk inserts.

If you see unexpected behaviour in the query plan, you can force the database to update its statistics; in MySQL you can use Optimize Table - which does everything, including re-ordering the table itself, or Analyze Table which only updates the indices.

This is hard to do on production environments, as both operations lock the tables; if you can possibly negotiate a maintenance window, that's by far the simplest way to deal with the problem.

It's worth measuring performance of "optimize table" - on well-specified hardware, it should take only a couple of seconds for "normal" size tables (up to low millions of records, with only a few indices). That might mean you can have an "informal" maintenance window - you don't take the application off-line, you just accept that some users will have degraded performance while you're running the scripts.

like image 51
Neville Kuyt Avatar answered Oct 19 '22 23:10

Neville Kuyt


MySQL has an EXPLAIN feature which will help you to understand the query:

$ mysql
> EXPLAIN SELECT `comments`.`comment_id`, `comments`.`comment_html`,`comments`.`comment_time_added`, `comments`.`comment_has_attachments`, `users`.`user_name`, `users`.`user_id`, `users`.`user_comments_count`, `users`.`user_time_registered`, `users`.`user_time_last_active`, `user_profile`.`user_avatar`, `user_profile`.`user_signature_html`
  FROM (`comments`)
  INNER JOIN `users` ON `comments`.`comment_user_id` = `users`.`user_id`
  INNER JOIN `user_profile` ON `users`.`user_id` = `user_profile`.`user_id`
  WHERE `comments`.`comment_enabled` =  1
  AND `comments`.`comment_content_id` =  12
  ORDER BY `comments`.`comment_time_added` ASC
  LIMIT 20

MySQL might simply be missing, or skipping an index.

You can learn more about understanding the output of EXPLAIN here from the documentation (a little hard-core), or better yet from a simpler explanation here, (ignore the fact that it's on a Java site.)

More than likely the amount of data, or an outdated or incomplete index is meaning that MySQL is falsely doing a table scan. When you see table scans, or sequential serches, you can often easily see which field is missing an index, or an index which is not usable.

like image 23
Lee Hambley Avatar answered Oct 19 '22 22:10

Lee Hambley