
How do I get MySQL to use an INDEX for view query?

Tags: indexing, mysql

I'm working on a Java EE web project with a MySQL database. We need a view that summarizes data from 3 tables with over 3M rows overall. Each table was created with an index, but I haven't found a way to take advantage of those indexes when running a conditional SELECT against the view, which is defined with [group by].

I've been told that using views in MySQL is not a good idea, because you can't create an index on a view in MySQL the way you can in Oracle. But in some tests I ran, indexes could be used by a SELECT against a view. Maybe I've created these views the wrong way.

I'll use an example to describe my problem.

We have a table that records high scores in NBA games, with an index on the column [happened_in]:

CREATE TABLE `highscores` (
    `tbl_id` int(11) NOT NULL auto_increment,
    `happened_in` int(4) default NULL,
    `player` int(3) default NULL,
    `score` int(3) default NULL,
    PRIMARY KEY (`tbl_id`),
    KEY `index_happened_in` (`happened_in`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Insert data (8 rows):

INSERT INTO highscores (happened_in, player, score)
VALUES (2006, 24, 61), (2006, 24, 44), (2006, 24, 81),
       (1998, 23, 51), (1997, 23, 46), (2006, 3, 55),
       (2007, 24, 34), (2008, 24, 37);

Then I create a view to see the highest score that Kobe Bryant got in each year:

CREATE OR REPLACE VIEW v_kobe_highscores AS
    SELECT player, max(score) AS highest_score, happened_in
    FROM highscores
    WHERE player = 24
    GROUP BY happened_in;

I wrote a conditional statement to see the highest score that Kobe got in 2006:

select * from v_kobe_highscores where happened_in = 2006; 

When I ran EXPLAIN on it in Toad for MySQL, I found that MySQL scans all the rows to build the view and only then applies the condition, without using the index on [happened_in].

explain select * from v_kobe_highscores where happened_in = 2006; 

[Image: explain result]

The view that we use in our project is built on tables with millions of rows. Scanning all the rows of those tables on every retrieval from the view is unacceptable. Please help! Thanks!

@zerkms Here is the result of my test on real-life data. I don't see much difference. I think @spencer7593 has the right point: the MySQL optimizer doesn't "push" that predicate down into the view query.

[Image: real-life test]

Roger Ray, asked Dec 19 '12 03:12




2 Answers

How do you get MySQL to use an index for a view query? The short answer: provide an index that MySQL can use.

In this case, the optimum index is likely a "covering" index:

... ON highscores (player, happened_in, score) 

It's likely that MySQL will use that index because of the WHERE player = 24 (an equality predicate on the leading column in the index), and the EXPLAIN will show "Using index". The GROUP BY happened_in (the second column in the index) may allow MySQL to use the index to avoid a sort operation. Including the score column in the index allows the query to be satisfied entirely from the index, without having to visit (look up) the data pages referenced by the index.

That's the quick answer. The longer answer is that MySQL is very unlikely to use an index with a leading column of happened_in for the view query.
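As a minimal sketch, the covering index could be created with a statement like the one below. The index name idx_player_year_score is a placeholder chosen for illustration; only the table and the column order come from the answer.

```sql
-- Covering index for the query that defines the view.
--   player      : equality predicate (WHERE player = 24)
--   happened_in : GROUP BY column
--   score       : the only other column referenced, needed for MAX(score)
-- The index name is hypothetical; pick whatever fits your naming scheme.
CREATE INDEX idx_player_year_score
    ON highscores (player, happened_in, score);

-- List the indexes on the table to confirm the new one is present.
SHOW INDEX FROM highscores;
```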


Why the view causes a performance issue

One of the issues you have with the MySQL view is that MySQL does not "push" the predicate from the outer query down into the view query.

Your outer query specifies WHERE happened_in = 2006. The MySQL optimizer does not consider that predicate when it runs the inner "view query". The query for the view gets executed separately, before the outer query. The resultset from the execution of that query gets "materialized"; that is, the results are stored as an intermediate MyISAM table. (MySQL calls it a "derived table", and that name makes sense once you understand the operations that MySQL performs.)

The bottom line is that the index you have defined on happened_in is not being used by MySQL when it runs the query that forms the view definition.

After the intermediate "derived table" is created, THEN the outer query is executed, using that "derived table" as a rowsource. It's when that outer query runs that the happened_in = 2006 predicate is evaluated.

Note that all of the rows from the view query are stored, which (in your case) is a row for EVERY value of happened_in, not just the one you specify an equality predicate on in the outer query.
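To make the processing order concrete, the query against the view behaves roughly as if it had been written with an inline derived table, as sketched here. The alias v is illustrative, and this is only an approximation of the behaviour described above, not something the optimizer prints.

```sql
SELECT v.player, v.highest_score, v.happened_in
  FROM ( -- Inner "view query": runs first and produces one row
         -- for EVERY year in which player 24 has a score.
         SELECT player, MAX(score) AS highest_score, happened_in
           FROM highscores
          WHERE player = 24
          GROUP BY happened_in
       ) AS v
 -- Outer predicate: evaluated afterwards, against the materialized rows.
 WHERE v.happened_in = 2006;
```

Written this way, it is easier to see why an index whose leading column is happened_in does not help the inner query: that query never sees the 2006 condition.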

The way that view queries are processed may be "unexpected" by some, and this is one reason that using "views" in MySQL can lead to performance problems, as compared to the way view queries are processed by other relational databases.


Improving performance of the view query with a suitable covering index

Given your view definition and your query, about the best you are going to get would be a "Using index" access method for the view query. To get that, you'd need a covering index, e.g.

... ON highscores (player, happened_in, score). 

That's likely to be the most beneficial index (performance wise) for your existing view definition and your existing query. The player column is the leading column because you have an equality predicate on that column in the view query. The happened_in column is next, because you've got a GROUP BY operation on that column, and MySQL is going to be able to use this index to optimize the GROUP BY operation. We also include the score column, because that is the only other column referenced in your query. That makes the index a "covering" index, because MySQL can satisfy that query directly from index pages, without a need to visit any pages in the underlying table. And that's as good as we're going to get out of that query plan: "Using index" with no "Using filesort".
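One way to check whether you got that plan is to run EXPLAIN on the query that defines the view, as in this sketch. It assumes a covering index on (player, happened_in, score) has already been created; the exact EXPLAIN output varies with the MySQL version and the data, so none is shown here.

```sql
-- EXPLAIN the view's defining query on its own.
-- In the output, check the "key" column for the covering index and the
-- "Extra" column for "Using index" (and the absence of "Using filesort").
EXPLAIN
SELECT player, MAX(score) AS highest_score, happened_in
  FROM highscores
 WHERE player = 24
 GROUP BY happened_in;
```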


Compare performance to standalone query with no derived table

You could compare the execution plan for your query against the view vs. an equivalent standalone query:

SELECT player
     , MAX(score) AS highest_score
     , happened_in
  FROM highscores
 WHERE player = 24
   AND happened_in = 2006
 GROUP
    BY player
     , happened_in

The standalone query can also make use of a covering index e.g.

... ON highscores (player, happened_in, score) 

but without a need to materialize an intermediate MyISAM table.
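A sketch of that comparison: run EXPLAIN on both forms and read the two plans side by side. The view name is the one from the question. A materialized view typically shows up as an extra row with a DERIVED select_type, but treat that as something to verify on your own server rather than a guarantee.

```sql
-- Plan for the query against the view.
EXPLAIN
SELECT *
  FROM v_kobe_highscores
 WHERE happened_in = 2006;

-- Plan for the equivalent standalone query, where both predicates
-- are visible to the optimizer at the same time.
EXPLAIN
SELECT player, MAX(score) AS highest_score, happened_in
  FROM highscores
 WHERE player = 24
   AND happened_in = 2006
 GROUP BY player, happened_in;
```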


I am not sure that any of the previous provides a direct answer to the question you were asking.

Q: How do I get MySQL to use an INDEX for view query?

A: Define a suitable INDEX that the view query can use.

The short answer is to provide a "covering index" (an index that includes all columns referenced in the view query). The leading columns in that index should be the columns that are referenced with equality predicates (in your case, player would be the leading column because you have a player = 24 predicate in the query). The columns referenced in the GROUP BY should come next in the index, which allows MySQL to optimize the GROUP BY operation by making use of the index rather than a sort operation.

The key point here is that the view query is basically a standalone query; the results from that query get stored in an intermediate "derived" table (a MyISAM table that gets created when a query against the view is run).

Using views in MySQL is not necessarily a "bad idea", but I would strongly caution those who choose to use views within MySQL to be AWARE of how MySQL processes queries that reference those views. And the way MySQL processes view queries differs (significantly) from the way view queries are handled by other databases (e.g. Oracle, SQL Server).

spencer7593, answered Sep 20 '22 10:09


Creating the composite index with player + happened_in (in this particular order) columns is the best you can do in this case.

PS: don't test MySQL optimizer behaviour on such a small number of rows, because it's likely to prefer a full scan over indexes. If you want to see what will happen in real life, fill the table with a realistic amount of data.
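As a rough sketch of how such test data could be generated, the statements below load one million random rows into highscores. The helper table digits, the value ranges (years 1990 to 2009, players 1 to 50, scores 0 to 81) and the row count are all assumptions made for illustration, so adjust them to resemble your real data.

```sql
-- Hypothetical helper table holding the digits 0..9.
CREATE TABLE digits (d INT NOT NULL PRIMARY KEY);
INSERT INTO digits (d) VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

-- Cross-joining the helper table six times yields 10^6 rows,
-- each with a random year, player and score.
INSERT INTO highscores (happened_in, player, score)
SELECT 1990 + FLOOR(RAND() * 20),   -- year in 1990..2009
       1 + FLOOR(RAND() * 50),      -- player in 1..50
       FLOOR(RAND() * 82)           -- score in 0..81
  FROM digits a
 CROSS JOIN digits b
 CROSS JOIN digits c
 CROSS JOIN digits e
 CROSS JOIN digits f
 CROSS JOIN digits g;

-- Refresh index statistics so the optimizer sees the new distribution.
ANALYZE TABLE highscores;
```

After loading, rerun the EXPLAIN statements; plans chosen for 8 rows say little about plans chosen for millions.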

zerkms, answered Sep 19 '22 10:09