Mysql: Optimize count query on derived table

Tags:

mysql

I am trying to do a count query on a derived table for paging purposes. The query look like this:

SELECT 
    assignment_completions.id as id,
    assignment_completions.first_name,
    assignment_completions.last_name,
    groups.name

FROM
    assignment_completions
        LEFT JOIN
    groups_users ON assignment_completions.user_id = groups_users.user_id
        LEFT JOIN
    groups ON groups_users.group_id = groups.id
WHERE
    assignment_completions.handler = 'course'
GROUP BY assignment_completions.id

The count query just wraps the above query like this:

SELECT COUNT(*) FROM (...) AS assignment_count

The query without the count executes in .005 seconds. The query with the count executes in 1.5 seconds.

I have tried the following with no luck:

1) Use indexed column (there is no performance gain here):

SELECT COUNT (id) FROM (...)

2) I have tried using SQL_CALC_FOUND_ROWS but it was actually a little slower (2 seconds or so).

Details:

assignment_completions: 200k rows

users: 35k rows

groups_users: 500k rows

groups: 1k rows

Table Definitions

CREATE TABLE `assignment_completions` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `user_id` int(11) DEFAULT NULL,
  `assignment_id` int(11) DEFAULT NULL,
  `handler` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `handler_id` int(11) DEFAULT NULL,
  `time_started` datetime DEFAULT NULL,
  `time_end` datetime DEFAULT NULL,
  `status` int(11) DEFAULT NULL,
  `application_instance_id` int(11) DEFAULT NULL,
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  `first_name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `last_name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `index_assignment_completions_on_first_name` (`first_name`) USING BTREE,
  KEY `index_assignment_completions_on_last_name` (`last_name`) USING BTREE,
  KEY `index_assignment_completions_on_user_id` (`user_id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=200001 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

CREATE TABLE `users` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `email` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `encrypted_password` varchar(255) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
  `reset_password_token` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `reset_password_sent_at` datetime DEFAULT NULL,
  `remember_created_at` datetime DEFAULT NULL,
  `sign_in_count` int(11) NOT NULL DEFAULT '0',
  `current_sign_in_at` datetime DEFAULT NULL,
  `last_sign_in_at` datetime DEFAULT NULL,
  `current_sign_in_ip` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `last_sign_in_ip` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  `application_instance_id` int(11) DEFAULT NULL,
  `username` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `first_name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `last_name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `status` int(11) DEFAULT NULL,
  `group_list_cache` text COLLATE utf8_unicode_ci,
  PRIMARY KEY (`id`),
  UNIQUE KEY `index_users_on_reset_password_token` (`reset_password_token`) USING BTREE,
  UNIQUE KEY `index_users_on_username_and_application_instance_id` (`username`,`application_instance_id`) USING BTREE,
  KEY `index_users_on_application_instance_id` (`application_instance_id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=30006 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

CREATE TABLE `groups_users` (
  `group_id` int(11) DEFAULT NULL,
  `user_id` int(11) DEFAULT NULL,
  UNIQUE KEY `index_groups_users_on_group_id_and_user_id` (`group_id`,`user_id`) USING BTREE,
  KEY `index_groups_users_on_group_id` (`group_id`) USING BTREE,
  KEY `index_groups_users_on_user_id` (`user_id`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

CREATE TABLE `groups` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `application_instance_id` int(11) DEFAULT NULL,
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  `description` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `group_type` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1045 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

EXPLAIN for query:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: assignment_completions
         type: index
possible_keys: PRIMARY,index_assignment_completions_on_first_name,index_assignment_completions_on_last_name,index_assignment_completions_on_user_id
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 199088
     filtered: 100.00
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: users
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: lms.assignment_completions.user_id
         rows: 1
     filtered: 100.00
        Extra: Using index
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: groups_users
         type: ref
possible_keys: index_groups_users_on_user_id
          key: index_groups_users_on_user_id
      key_len: 5
          ref: lms.users.id
         rows: 1
     filtered: 100.00
        Extra: NULL
*************************** 4. row ***************************
           id: 1
  select_type: SIMPLE
        table: groups
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: lms.groups_users.group_id
         rows: 1
     filtered: 100.00
        Extra: Using index

EXPLAIN for count query:

*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived2>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 199088
     filtered: 100.00
        Extra: NULL
*************************** 2. row ***************************
           id: 2
  select_type: DERIVED
        table: assignment_completions
         type: index
possible_keys: PRIMARY,index_assignment_completions_on_first_name,index_assignment_completions_on_last_name,index_assignment_completions_on_user_id
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 199088
     filtered: 100.00
        Extra: Using where
*************************** 3. row ***************************
           id: 2
  select_type: DERIVED
        table: users
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: lms.assignment_completions.user_id
         rows: 1
     filtered: 100.00
        Extra: Using index
*************************** 4. row ***************************
           id: 2
  select_type: DERIVED
        table: groups_users
         type: ref
possible_keys: index_groups_users_on_user_id
          key: index_groups_users_on_user_id
      key_len: 5
          ref: lms.users.id
         rows: 1
     filtered: 100.00
        Extra: NULL
*************************** 5. row ***************************
           id: 2
  select_type: DERIVED
        table: groups
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: lms.groups_users.group_id
         rows: 1
     filtered: 100.00
        Extra: Using index

I need to count the total results for paging purposes.

EDIT

Occasionally this query is modified, which is why the groups are joined. Sometimes a where clause is added for groups:

AND groups.name LIKE "%abc%"

For this reason the join with the groups table is necessary.

931

asked Mar 19 '14 18:03

2 Answers

To process a SELECT COUNT(*) FROM t statement, InnoDB scans an index of the table, which takes some time if the index is not entirely in the buffer pool. If your table does not change often, using the MySQL query cache is a good solution. To get a fast count.

You can also force InnoDB to use an index:

  SELECT COUNT(id) FROM assignment_completions USE INDEX (PRIMARY);

Other then that , i see you using many indexes,which will slow your query.

try use only the index on id which you will count on.

Innodb does not have cached row count. so count(*) without where clause is slow with Innodb tables.

this may help you

156

answered Oct 19 '22 22:10

I would have the following covering indexes on the data so the engine does not have to go back to the raw data pages. Also, since the only purpose you are going from users to groups_users, but the groups users is based on the user id anyhow, so from the assignment completions directly to the groups_users table

table                   index
assignment_completions  ( handler, id, user_id, first_name, last_name )
groups_users            ( user_id, group_id )
groups                  ( id, name )

SELECT STRAIGHT_JOIN
      assignment_completions.id as id,
      assignment_completions.first_name,
      assignment_completions.last_name,
      groups.name
   FROM
      assignment_completions
         LEFT JOIN groups_users 
            ON assignment_completions.user_id = groups_users.user_id
            LEFT JOIN groups 
               ON groups_users.group_id = groups.id
   WHERE
      assignment_completions.handler = 'course'
   GROUP BY 
      assignment_completions.id

@orourkedd, I looked again at the table structure. Really, would you ever expect a name (first/last) to EVER be more than 20-25 characters? Also, the handler field to 255 chars? Really? If that is allowing the data pages to be over-bloated, could that be part of a delay issue? Similarly in the user's table. Also, if the assignment_completion's first / last name is that of the user who completed the assignment, storing just the user's ID to the table instead would be MUCH easier long term and shorten the disk requirements on assignment completions. Yes, an extra join, but since it would be based on an "ID" column it would be fast, especially if the user's table was a covering index on (id, first_name, last_name), but definitely would shrink down the size of those columns. Similarly to the groups "name" column. Because I don't know how the actual indexes are prepared, if it IS building out allocation of 255 per field, that could hinder the indexed performance.

After re-reviewing my own comments. It appears you have a handler lookup table to (by reference to the handler_id column). I would change the index from (handler, ...) to (handler_id, ...) on the assignment_completions table and REMOVE the handler column as the duplication over 200k rows vs a single lookup table ID pointer will cut amount of pages to blow through. Likewise for saving a "userID" vs the full first/last name in the assignment completions table. This is forcing 3 columns of up to 255 chars each for 200k records (as of now and obviously going to grow).

answered Oct 19 '22 23:10

DRapp

Related questions
                            
                                Convert a Website with a responsive design to the Android app [closed]
                            
                                MySQL - what happens when an UPDATE command is interrupted?
                            
                                Should it be possible to create a Foreign Key where column size doesn't match
                            
                                Can I define multiple events in one Trigger declaration in mysql?
                            
                                group concat equivalent in pig?
                            
                                Is extending classes good practice?
                            
                                Datetime format PHP and MySQL
                            
                                update row if count(*) > n
                            
                                How to display result and or any message in trigger body?
                            
                                MySQL - How to use a variable as column name
                            
                                Store MySQL password in bash-script
                            
                                Ebean and many-to-many relationships - how?
                            
                                How to select and insert values in mysql using stored procedures
                            
                                MySQL - show query by query id in mysql < 5.1.7
                            
                                Can you make an inheritance diagram in MySQL Workbench?
                            
                                icontains in Django QuerySet doesn't give the case insensitive result?
                            
                                Any alternative to MONTH() and YEAR() functions which are common to both MySQL and PostgreSQL
                            
                                How to import multiple database at once through PhpMyAdmin?
                            
                                Mysql - flexible, excel-like structure
                            
                                MySQL innodb - Foreign Key: only 1st works?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Mysql: Optimize count query on derived table

Tags:

performance

mysql

orourkedd

People also ask

2 Answers

echo_Me

DRapp

Recent Activity

Donate For Us