
MySQL Slow on join. Any way to speed up

Tags: date, join, php, mysql

I have two tables: music and listenTrack. listenTrack records the unique plays of each song. I am trying to get the most popular songs of the month. I get the right results, but the query takes too long. My tables and query are below.

listentrack (430,000 rows):

CREATE TABLE `listentrack` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `sessionId` varchar(50) NOT NULL,
    `url` varchar(50) NOT NULL,
    `date_created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    `ip` varchar(150) NOT NULL,
    `user_id` int(11) DEFAULT NULL,
     PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=731306 DEFAULT CHARSET=utf8

music (12,500 rows):

CREATE TABLE `music` (
   `music_id` int(11) NOT NULL AUTO_INCREMENT,
   `user_id` int(11) NOT NULL,
   `title` varchar(50) DEFAULT NULL,
   `artist` varchar(50) DEFAULT NULL,
   `description` varchar(255) DEFAULT NULL,
   `genre` int(4) DEFAULT NULL,
   `file` varchar(255) NOT NULL,
   `url` varchar(50) NOT NULL,
   `allow_download` int(2) NOT NULL DEFAULT '1',
   `plays` bigint(20) NOT NULL,
   `downloads` bigint(20) NOT NULL,
   `faved` bigint(20) NOT NULL,
   `dateadded` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
   PRIMARY KEY (`music_id`)
) ENGINE=MyISAM AUTO_INCREMENT=15146 DEFAULT CHARSET=utf8


SELECT COUNT(listenTrack.url) AS total, listenTrack.url 
FROM listenTrack
LEFT JOIN music ON music.url = listenTrack.url
WHERE DATEDIFF(DATE(date_created),'2009-08-15') = 0
GROUP BY listenTrack.url
ORDER BY total DESC
LIMIT 0,10

This query isn't very complex and the rows aren't too large, I don't think.

Is there any way to speed this up, or can you suggest a better solution? This is going to be a cron job at the beginning of every month, but I would also like to produce per-day results.

By the way, I am running this locally, where it takes over 4 minutes; on production it takes about 45 seconds.

Khary asked Aug 18 '09 00:08



4 Answers

I'm more of a SQL Server guy but these concepts should apply.

I'd add indexes:

  1. On listentrack, add a composite index on url and date_created
  2. On music, add an index on url

These indexes should speed the query up tremendously (I originally had the table names mixed up; fixed in the latest edit).
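
A minimal sketch of those two indexes in MySQL syntax. The index names are made up for illustration and do not come from the question:

```sql
-- Hypothetical index names; use whatever fits your naming scheme.
-- Composite index on the join/grouping column plus the date filter column.
ALTER TABLE listentrack ADD INDEX idx_listentrack_url_date (url, date_created);

-- Single-column index for the join from listentrack to music.
ALTER TABLE music ADD INDEX idx_music_url (url);
```

On MyISAM tables of this size, building each index should be a one-off cost, paid once rather than on every run of the report.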

Jeff Siver answered Sep 27 '22 23:09


For the most part you should also index any column that is used in a JOIN. In your case, you should index both listentrack.url and music.url.

@jeff s - An index on listentrack.date_created alone wouldn't help with the query as written, because the column is passed through a function first, so MySQL cannot use an index on it. Often you can rewrite a query so that the indexed column is compared directly, without a function wrapped around it. For example:

DATEDIFF(DATE(date_created),'2009-08-15') = 0

becomes

date_created >= '2009-08-15' and date_created < '2009-08-16'

This will filter down records that are from 2009-08-15 and allow any indexes on that column to be candidates. Note that MySQL might NOT use that index, it depends on other factors.

Your best bet is to make a composite index on listentrack(url, date_created) and then another index on music.url.

These 2 indexes will cover this particular query.
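
Applied to the query from the question, the range rewrite might look like the sketch below (not tested against the poster's data). The upper bound is the following day, exclusive. The whole-month bounds in the trailing comment are an assumed adaptation for the monthly report mentioned in the question:

```sql
SELECT COUNT(listenTrack.url) AS total, listenTrack.url
FROM listenTrack
LEFT JOIN music ON music.url = listenTrack.url
-- Half-open range: everything on 2009-08-15, nothing from the next day.
WHERE date_created >= '2009-08-15'
  AND date_created <  '2009-08-16'
GROUP BY listenTrack.url
ORDER BY total DESC
LIMIT 0, 10;

-- For a whole month, only the bounds change:
--   WHERE date_created >= '2009-08-01' AND date_created < '2009-09-01'
```

Because date_created is no longer wrapped in DATEDIFF() and DATE(), MySQL can consider an index on it instead of evaluating the functions for every row.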

Note that if you run EXPLAIN on this query you will most likely still see "Using filesort" (often together with "Using temporary"), because MySQL has to sort the grouped counts to satisfy the ORDER BY on total. Despite the name, a filesort does not necessarily touch the disk.

In general you should always run your query under EXPLAIN to get an idea of how MySQL will execute it, and then go from there. See the EXPLAIN documentation:

http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
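
As a quick illustration, prefixing the original query with EXPLAIN returns the execution plan instead of the results. The comments list the output columns that are usually the most informative here:

```sql
EXPLAIN
SELECT COUNT(listenTrack.url) AS total, listenTrack.url
FROM listenTrack
LEFT JOIN music ON music.url = listenTrack.url
WHERE DATEDIFF(DATE(date_created), '2009-08-15') = 0
GROUP BY listenTrack.url
ORDER BY total DESC
LIMIT 0, 10;

-- Columns worth reading in the result:
--   type  : ALL means a full table scan
--   key   : the index MySQL chose, or NULL if none
--   rows  : estimated number of rows examined
--   Extra : notes such as "Using index", "Using temporary", "Using filesort"
```

Running it once before and once after adding indexes makes it easy to see whether the plan actually changed.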

Cody Caughlan answered Sep 28 '22 00:09


Try creating an index that will help with the join:

CREATE INDEX idx_url ON music (url);
VoteyDisciple answered Sep 27 '22 22:09


I think I might have missed the obvious before. Why are you joining the music table at all? You do not appear to use any data from that table, and the left join is not required, right? Having this table in the query will make it much slower without adding any value. Take all references to music out, unless you need to restrict the results to urls that exist in music, in which case you need an inner join so that rows without a matching value are excluded.
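
A sketch of the query with the music table removed. It assumes url is unique in music; if it is not, the original left join would have multiplied rows and the totals would differ:

```sql
-- Same report without the music table. url is NOT NULL in listentrack,
-- so COUNT(*) counts the same rows as COUNT(listenTrack.url).
SELECT COUNT(*) AS total, url
FROM listenTrack
WHERE DATEDIFF(DATE(date_created), '2009-08-15') = 0  -- date filter unchanged from the question
GROUP BY url
ORDER BY total DESC
LIMIT 0, 10;
```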


I would add new indexes, as the others mention. Specifically, I would add one on music (url) and one on listentrack (date_created, url).

This will improve your join a ton.

Then I would look at the query: you are forcing the system to perform work on each row of the table. It would be better to rephrase the date restriction as a range.

Not sure of the syntax off the top of my head, but something like: where date_created >= '2009-08-15 00:00:00' and date_created < '2009-08-16 00:00:00'

That should allow it to rapidly use the index to locate the appropriate records. The combined two-column index on listentrack should allow it to find the records based on the date and URL. You should experiment; the columns might be better off in the other order on the index: url, date_created.
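
The two column orders can be tried side by side, roughly as sketched below. The index names are hypothetical, and which order wins depends on the data and on the plan MySQL picks, so treat the comments as expectations to check with EXPLAIN rather than guarantees:

```sql
-- Option A: date first. The range on date_created can narrow the scan,
-- and url is available from the index for the GROUP BY.
ALTER TABLE listentrack ADD INDEX idx_date_url (date_created, url);

-- Option B: url first. Rows come out of the index ordered by url, but when
-- the query has no condition on url, the date range cannot narrow the scan.
ALTER TABLE listentrack ADD INDEX idx_url_date (url, date_created);

-- After comparing plans and timings, drop the index that is not used, e.g.:
-- ALTER TABLE listentrack DROP INDEX idx_url_date;
```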

The explain plan for this query should say "Using index" in the Extra column for both tables. That means it will not have to hit the data in the table to calculate your counts.

I would also check the memory settings that you have configured for MySQL. It sounds like you do not have enough memory allocated. Be very careful about the difference between server-wide settings and per-thread settings. A server-wide cache of 10MB is pretty small, but a per-thread buffer of 10MB can use a lot of memory quickly.
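
One way to inspect the current values is SHOW VARIABLES. The variables below are my guess at the ones most relevant to MyISAM tables and to a GROUP BY / ORDER BY query; the answer itself does not name any:

```sql
-- Server-wide: cache for MyISAM index blocks, shared by all connections.
SHOW VARIABLES LIKE 'key_buffer_size';

-- Per-thread: allocated by each connection that needs them.
SHOW VARIABLES LIKE 'sort_buffer_size';   -- used for sorts (ORDER BY, GROUP BY)
SHOW VARIABLES LIKE 'read_buffer_size';   -- used for sequential table scans

-- Limits on in-memory temporary tables before MySQL switches to disk.
SHOW VARIABLES LIKE 'tmp_table_size';
SHOW VARIABLES LIKE 'max_heap_table_size';
```

Comparing the output on the local machine and on production may help explain the gap between 4 minutes and 45 seconds.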

Jacob

TheJacobTaylor answered Sep 27 '22 22:09