I've hit the DB performance bottleneck, where now?

I have some queries that are taking too long (300ms) now that the DB has grown to a few million records. Luckily for me, the queries don't need to look at the majority of this data; the latest 100,000 records will be sufficient, so my plan is to maintain a separate table with the most recent 100,000 records and run the queries against that. If anyone has any suggestions for a better way of doing this, that would be great.

My real question is: what are the options if the queries did need to run against the historic data? What is the next step? Things I've thought of:

Upgrade hardware
Use an in memory database
Cache the objects manually in your own data structure

Are these things correct and are there any other options? Do some DB providers have more functionality than others to deal with these problems, e.g. specifying a particular table/index to be entirely in memory?

Sorry, I should've mentioned this: I'm using MySQL.

I forgot to mention indexing in the above. Indexing has been my only source of improvement thus far, to be quite honest. To identify bottlenecks I've been using Maatkit on the queries to show whether or not indexes are being utilised.

I understand I'm now getting away from what the question was intended for, so maybe I should ask another one. My problem is that EXPLAIN is saying the query takes 10ms rather than the 300ms which JProfiler is reporting. If anyone has any suggestions I'd really appreciate it. The query is:

select bv.* 
from BerthVisit bv 
inner join BerthVisitChainLinks on bv.berthVisitID = BerthVisitChainLinks.berthVisitID 
inner join BerthVisitChain on BerthVisitChainLinks.berthVisitChainID = BerthVisitChain.berthVisitChainID 
inner join BerthJourneyChains on BerthVisitChain.berthVisitChainID = BerthJourneyChains.berthVisitChainID 
inner join BerthJourney on BerthJourneyChains.berthJourneyID = BerthJourney.berthJourneyID 
inner join TDObjectBerthJourneyMap on BerthJourney.berthJourneyID = TDObjectBerthJourneyMap.berthJourneyID 
inner join TDObject on TDObjectBerthJourneyMap.tdObjectID = TDObject.tdObjectID 
where 
BerthJourney.journeyType='A' and 
bv.berthID=251860 and 
TDObject.headcode='2L32' and 
bv.depTime is null and 
bv.arrTime > '2011-07-28 16:00:00'

and the output from EXPLAIN is:

+----+-------------+-------------------------+-------------+---------------------------------------------+-------------------------+---------+------------------------------------------------+------+-------------------------------------------------------+
| id | select_type | table                   | type        | possible_keys                               | key                     | key_len | ref                                            | rows | Extra                                                 |
+----+-------------+-------------------------+-------------+---------------------------------------------+-------------------------+---------+------------------------------------------------+------+-------------------------------------------------------+
|  1 | SIMPLE      | bv                      | index_merge | PRIMARY,idx_berthID,idx_arrTime,idx_depTime | idx_berthID,idx_depTime | 9,9     | NULL                                           |  117 | Using intersect(idx_berthID,idx_depTime); Using where | 
|  1 | SIMPLE      | BerthVisitChainLinks    | ref         | idx_berthVisitChainID,idx_berthVisitID      | idx_berthVisitID        | 8       | Network.bv.berthVisitID                        |    1 | Using where                                           | 
|  1 | SIMPLE      | BerthVisitChain         | eq_ref      | PRIMARY                                     | PRIMARY                 | 8       | Network.BerthVisitChainLinks.berthVisitChainID |    1 | Using where; Using index                              | 
|  1 | SIMPLE      | BerthJourneyChains      | ref         | idx_berthJourneyID,idx_berthVisitChainID    | idx_berthVisitChainID   | 8       | Network.BerthVisitChain.berthVisitChainID      |    1 | Using where                                           | 
|  1 | SIMPLE      | BerthJourney            | eq_ref      | PRIMARY,idx_journeyType                     | PRIMARY                 | 8       | Network.BerthJourneyChains.berthJourneyID      |    1 | Using where                                           | 
|  1 | SIMPLE      | TDObjectBerthJourneyMap | ref         | idx_tdObjectID,idx_berthJourneyID           | idx_berthJourneyID      | 8       | Network.BerthJourney.berthJourneyID            |    1 | Using where                                           | 
|  1 | SIMPLE      | TDObject                | eq_ref      | PRIMARY,idx_headcode                        | PRIMARY                 | 8       | Network.TDObjectBerthJourneyMap.tdObjectID     |    1 | Using where                                           | 
+----+-------------+-------------------------+-------------+---------------------------------------------+-------------------------+---------+------------------------------------------------+------+-------------------------------------------------------+

7 rows in set (0.01 sec)

asked Jul 28 '11 14:07

James

4 Answers

Make sure all your indexes are optimized. Use EXPLAIN on the query to see if it is using your indexes efficiently.
If you are doing some heavy joins, then start thinking about doing this calculation in Java.
Think about using other kinds of DB, such as a NoSQL store. You may be able to do some preprocessing and put data in Memcache to help you a little.
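
As one illustration of the index point above: the EXPLAIN output in the question shows MySQL intersecting two single-column indexes on BerthVisit (idx_berthID and idx_depTime). A composite index covering all three filtered columns is sometimes cheaper than an index merge. This is only a sketch: the index name idx_berth_dep_arr is made up, and whether it actually helps depends on the data distribution, so compare the plans before and after.

```sql
-- Hypothetical index name. Column order: the two columns compared for
-- equality / IS NULL come first, the range column (arrTime) comes last.
ALTER TABLE BerthVisit
  ADD INDEX idx_berth_dep_arr (berthID, depTime, arrTime);

-- Re-check the plan for the BerthVisit filter on its own, using the
-- same predicates as the query in the question.
EXPLAIN
SELECT bv.berthVisitID
FROM BerthVisit bv
WHERE bv.berthID = 251860
  AND bv.depTime IS NULL
  AND bv.arrTime > '2011-07-28 16:00:00';
```

If the new plan shows the composite index in the key column with a lower rows estimate than the index_merge plan, it is probably worth keeping; otherwise drop it again, since every extra index slows down writes.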

answered Nov 10 '22 16:11

Amir Raminfar

Considering a design change like this is not a good sign - I bet you still have plenty of performance to squeeze out using EXPLAIN, adjusting db variables and improving the indexes and queries. But you're probably past the point where "trying stuff" works very well. It's an opportunity to learn how to interpret the analyses and logs, and use what you learn for specific improvements to indexes and queries.

If your suggestion were a good one, you should be able to tell us why already. And note that this is a popular pessimization; see "What is the most ridiculous pessimization you've seen?"
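
One way to start on the logs mentioned above is MySQL's slow query log, which records what the server itself measured for each statement. A minimal sketch, assuming MySQL 5.1 or later and an account allowed to set global variables; the 0.1 second threshold is just an example value:

```sql
-- Turn on the slow query log at runtime (not persisted across restarts).
SET GLOBAL slow_query_log = 'ON';

-- Log statements that run longer than 0.1 seconds. The new global value
-- only applies to connections opened after this statement.
SET GLOBAL long_query_time = 0.1;

-- Find out which file the server writes the log to.
SHOW VARIABLES LIKE 'slow_query_log_file';
```

Comparing the times in that log with what the application profiler reports can show whether the 300ms is spent inside MySQL or somewhere between the database and the application. Note also that the "(0.01 sec)" printed after an EXPLAIN is the time taken to produce the plan, not the time taken to run the query.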

answered Nov 10 '22 16:11

dkretz

Well, if you have optimised the database and queries, I'd say that rather than chop up the data, the next step is to look at:

a) the MySQL configuration, to make sure that it is making the most of the hardware

b) the hardware. You don't say what hardware you are using. You may find that replication is an option in your case if you can buy two or three servers to divide up the reads from the database (writes have to be done to a central server, but reads can be served from any number of slaves).
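
For point (a), a quick first check is whether the working set fits in memory. The sketch below assumes the tables use InnoDB, which the question does not state; for MyISAM the equivalent setting would be key_buffer_size.

```sql
-- How much memory InnoDB may use to cache data and index pages (bytes).
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Innodb_buffer_pool_read_requests counts logical page reads;
-- Innodb_buffer_pool_reads counts the ones that had to go to disk.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
```

If Innodb_buffer_pool_reads is a noticeable fraction of Innodb_buffer_pool_read_requests, the buffer pool is likely too small for the data being queried, and raising it (within the RAM the machine actually has) is usually cheaper than new hardware.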

answered Nov 10 '22 18:11

Jaydee

Instead of creating a separate table for the latest results, think about table partitioning. MySQL has had this feature built in since version 5.1.

Just to make it clear: I am not saying this is THE solution for your issues, just one thing you can try.
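
To give an idea of what that looks like, here is a rough sketch of range partitioning by arrival time. The table name BerthVisitPartitioned, the column types, and the partition boundaries are all assumptions for illustration, not the asker's real schema. One restriction to be aware of: the partitioning column has to be part of every unique key, including the primary key.

```sql
-- Hypothetical table; types and boundaries are illustrative only.
CREATE TABLE BerthVisitPartitioned (
  berthVisitID BIGINT   NOT NULL,
  berthID      BIGINT   NOT NULL,
  arrTime      DATETIME NOT NULL,
  depTime      DATETIME NULL,
  -- arrTime must appear in the primary key because it is the partition key.
  PRIMARY KEY (berthVisitID, arrTime),
  KEY idx_berthID (berthID)
)
PARTITION BY RANGE (TO_DAYS(arrTime)) (
  PARTITION p2011_06 VALUES LESS THAN (TO_DAYS('2011-07-01')),
  PARTITION p2011_07 VALUES LESS THAN (TO_DAYS('2011-08-01')),
  PARTITION pmax     VALUES LESS THAN MAXVALUE
);

-- Shows which partitions the optimizer would read (MySQL 5.1 to 5.7;
-- in 8.0 plain EXPLAIN includes a partitions column instead).
EXPLAIN PARTITIONS
SELECT *
FROM BerthVisitPartitioned
WHERE arrTime > '2011-07-28 16:00:00';
```

With a filter on arrTime, the server can skip partitions that cannot contain matching rows, which gives roughly the effect of the "recent records only" table without maintaining a second copy of the data.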

answered Nov 10 '22 18:11

Mchl

Tags:

database

mysql

caching

database-design

database-performance