 

MySQL COUNT performance on very big tables

I have a table with more than 100 million rows in InnoDB.

I need to know whether there are more than 5000 rows where the foreign key = 1. I don't need the exact number.

I did some testing:

SELECT COUNT(*) FROM table WHERE fk = 1 => 16 seconds
SELECT COUNT(*) FROM table WHERE fk = 1 LIMIT 5000 => 16 seconds
SELECT pk FROM table WHERE fk = 1 => 0.6 seconds

The third option means more network transfer and more processing time in the application, but it still saves about 15.4 seconds over the COUNT queries!
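Since I don't need the exact number, I could also cap the transfer by limiting that last query and counting the returned rows on the application side (a variant I haven't benchmarked):

SELECT pk FROM table WHERE fk = 1 LIMIT 5000;

If 5000 rows come back there are at least 5000 matches; if fewer come back, the count is exact.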

Do you have a better idea?

Thanks

Edit: [Added OP's relevant comments]

I tried SELECT SQL_NO_CACHE COUNT(fk) FROM table WHERE fk = 1, but it took 25 seconds.

MySQL was tuned for InnoDB with MySQL Tuner.

CREATE TABLE `table` (
  `pk` bigint(20) NOT NULL AUTO_INCREMENT,
  `fk` tinyint(3) unsigned DEFAULT '0',
  PRIMARY KEY (`pk`),
  KEY `idx_fk` (`fk`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=100380914 DEFAULT CHARSET=latin1

Relevant InnoDB server variables:

have_innodb = YES
ignore_builtin_innodb = OFF
innodb_adaptive_hash_index = ON
innodb_additional_mem_pool_size = 20971520
innodb_autoextend_increment = 8
innodb_autoinc_lock_mode = 1
innodb_buffer_pool_size = 25769803776
innodb_checksums = ON
innodb_commit_concurrency = 0
innodb_concurrency_tickets = 500
innodb_data_file_path = ibdata1:10M:autoextend
innodb_data_home_dir = ''
innodb_doublewrite = ON
innodb_fast_shutdown = 1
innodb_file_io_threads = 4
innodb_file_per_table = OFF
innodb_flush_log_at_trx_commit = 1
innodb_flush_method = ''
innodb_force_recovery = 0
innodb_lock_wait_timeout = 50
innodb_locks_unsafe_for_binlog = OFF
innodb_log_buffer_size = 8388608
innodb_log_file_size = 26214400
innodb_log_files_in_group = 2
innodb_log_group_home_dir = ./
innodb_max_dirty_pages_pct = 90
innodb_max_purge_lag = 0
innodb_mirrored_log_groups = 1
innodb_open_files = 300
innodb_rollback_on_timeout = OFF
innodb_stats_on_metadata = ON
innodb_support_xa = ON
innodb_sync_spin_loops = 20
innodb_table_locks = ON
innodb_thread_concurrency = 8
innodb_thread_sleep_delay = 10000
innodb_use_legacy_cardinality_algorithm = ON

Update 2015: I have used the same method up to now with 600 million rows and 640,000 new rows per day. It's still working fine.

asked Jun 11 '12 by hotips


2 Answers

You don't seem interested in the actual count, so give this a try:

SELECT 1 FROM table WHERE fk = 1 LIMIT 5000, 1

If a row is returned, there are more than 5000 matching records (the query skips the first 5000 and tries to fetch one more). I presume the fk column is indexed.
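If you'd rather get a single true/false value back instead of checking whether a row came back, you can wrap the same trick in EXISTS (a sketch using the schema from the question):

SELECT EXISTS(
  SELECT 1 FROM `table` WHERE fk = 1 LIMIT 5000, 1
) AS has_more_than_5000;

Either way, the server only has to walk about 5001 entries of the idx_fk index instead of counting all of them, which is why this is so much cheaper than COUNT(*).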

answered Nov 09 '22 by Salman A


A counter table or some other caching mechanism is the solution. From the MySQL manual:

InnoDB does not keep an internal count of rows in a table because concurrent transactions might “see” different numbers of rows at the same time. To process a SELECT COUNT(*) FROM t statement, InnoDB scans an index of the table, which takes some time if the index is not entirely in the buffer pool. If your table does not change often, using the MySQL query cache is a good solution. To get a fast count, you have to use a counter table you create yourself and let your application update it according to the inserts and deletes it does. If an approximate row count is sufficient, SHOW TABLE STATUS can be used. See Section 14.3.14.1, “InnoDB Performance Tuning Tips”.

  • http://dev.mysql.com/doc/refman/5.5/en/innodb-restrictions.html
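A minimal sketch of the counter-table approach, built on the pk/fk schema from the question (the fk_counts table and trigger names are illustrative, and this assumes fk is never NULL; the manual also suggests letting the application maintain the counts instead of triggers):

-- Counter table keyed by fk
CREATE TABLE fk_counts (
  fk  tinyint(3) unsigned NOT NULL PRIMARY KEY,
  cnt bigint(20) NOT NULL DEFAULT 0
) ENGINE=InnoDB;

-- Seed it once from the existing data (a full scan, so do this off-peak)
INSERT INTO fk_counts (fk, cnt)
SELECT fk, COUNT(*) FROM `table` GROUP BY fk;

-- Keep it in sync on writes
CREATE TRIGGER table_after_insert AFTER INSERT ON `table` FOR EACH ROW
  INSERT INTO fk_counts (fk, cnt) VALUES (NEW.fk, 1)
  ON DUPLICATE KEY UPDATE cnt = cnt + 1;

CREATE TRIGGER table_after_delete AFTER DELETE ON `table` FOR EACH ROW
  UPDATE fk_counts SET cnt = cnt - 1 WHERE fk = OLD.fk;

-- The 5000-row check becomes a primary-key lookup
SELECT cnt >= 5000 FROM fk_counts WHERE fk = 1;

The trade-off is that every insert and delete now also locks the matching fk_counts row, which can become a hotspot under heavy concurrent writes on the same fk value; in exchange, the check itself is a constant-time lookup.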
answered Nov 09 '22 by scriptin