COUNT and GROUP BY on text fields seems slow

I'm building a MySQL database which contains entries about special substrings of DNA in species of yeast. My table looks like this:

+--------------+---------+------+-----+---------+-------+
| Field        | Type    | Null | Key | Default | Extra |
+--------------+---------+------+-----+---------+-------+
| species      | text    | YES  | MUL | NULL    |       |
| region       | text    | YES  | MUL | NULL    |       |
| gene         | text    | YES  | MUL | NULL    |       |
| startPos     | int(11) | YES  |     | NULL    |       |
| repeatLength | int(11) | YES  |     | NULL    |       |
| coreLength   | int(11) | YES  |     | NULL    |       |
| sequence     | text    | YES  | MUL | NULL    |       |
+--------------+---------+------+-----+---------+-------+

There are approximately 1.8 million records. In one type of query, I want to see how many DNA substrings are associated with each combination of species and region, so I issue this query:

select species, region, count(*) from your_table group by species, region;

The species and region columns each have only two possible values (conserved/scer for species, and promoter/coding for region), yet this query takes about 30 seconds.

Is this a normal amount of time to expect for this type of query, given the size of the table? Is it slow because I'm using text fields instead of simple integer or boolean values? (I prefer text fields, as several non-CS researchers will be using the DB.) Any other ideas and suggestions would be welcome.

Please excuse me if this is a boneheaded question; I am an SQL neophyte.

P.S. I've also seen this question, but the proposed solution doesn't seem relevant to what I'm doing.

EDIT: Converting those fields to VARCHARs reduced the runtime to ~2.5 seconds. I also timed the query with ENUM columns, which gave a similar result.
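
A minimal sketch of the conversion described in the edit, for readers who want to try it. The table name `your_table`, the `VARCHAR(16)` length, and the index name `idx_species_region` are assumptions, not taken from the post; the length is simply wide enough for the four values mentioned (conserved, scer, promoter, coding).

```sql
-- `your_table` is a placeholder: the post does not give the real table name.
-- VARCHAR(16) is an assumed length, chosen to fit the four values mentioned.
ALTER TABLE your_table
  MODIFY species VARCHAR(16),
  MODIFY region  VARCHAR(16);

-- ENUM alternative (the edit reports similar timing). Uncomment to use it
-- instead of the VARCHAR version above.
-- ALTER TABLE your_table
--   MODIFY species ENUM('conserved', 'scer'),
--   MODIFY region  ENUM('promoter', 'coding');

-- Optional, and not something the post tried: a composite index covering
-- both GROUP BY columns. The index name is hypothetical.
CREATE INDEX idx_species_region ON your_table (species, region);
```

Whether the composite index helps further depends on the MySQL version and storage engine, so it is worth timing the query before and after adding it.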

asked Jul 22 '10 02:07

Rich

1 Answer

Why are all your string-based columns defined as TEXT? According to this performance comparison, TEXT was ~3x slower than a VARCHAR column using identical indexing: http://forums.mysql.com/read.php?24,105964,105964
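
One way to see where the time goes is to inspect the query plan and the session's temporary-table counter before and after changing the column types. This is a hedged sketch, not part of the original answer: `your_table` is a placeholder for the real table name, which the question does not give.

```sql
-- `your_table` is a placeholder for the real table name.
-- In the EXPLAIN output, check the Extra column for "Using temporary"
-- and "Using filesort", which indicate the GROUP BY is being resolved
-- through an internal temporary table and a sort.
EXPLAIN
SELECT species, region, COUNT(*)
FROM your_table
GROUP BY species, region;

-- Run the actual query, then compare this counter with its earlier value.
-- If it increases, the internal temporary table was created on disk.
SHOW SESSION STATUS LIKE 'Created_tmp_disk_tables';
```

In older MySQL releases, internal temporary tables that contain TEXT or BLOB columns are created on disk rather than in memory, which is one plausible contributor to the gap between TEXT and VARCHAR here. The exact behavior depends on the server version.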

answered Sep 25 '22 23:09

OMG Ponies

Tags: sql, database, mysql, aggregate-functions, query-optimization

Rich

People also ask

1 Answers

OMG Ponies

Recent Activity

Donate For Us