Can a MySQL prefix index be used just like a normal index?
Suppose there is a TEXT column with a prefix index of length 1, and the query is:
SELECT * FROM table WHERE textcol = 'ab'
Would it just give me all rows starting with 'a' or would it check the whole column value?
In general, I'm curious to know whether there are any caveats when using prefix indexes. Not regarding performance, but rather whether any queries would have to be written differently or whether the client would have to do extra logic.
If you think about it for a moment, MySQL will still give you the correct answer, even with no index... it just won't be as fast... so, yes, you'll still get the correct answer with a prefix index.
Performance will be lower than with a full index, because after the index identifies the "possible" rows, the server goes to the row data and filters them against the WHERE clause. That is two steps instead of one, but nothing the application needs to care about.
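To make the two steps concrete, here is a minimal sketch. The table prefix_demo, its columns, and the index name are hypothetical, chosen only for illustration. On a table this small the optimizer may well skip the index entirely, so treat the EXPLAIN call as something to inspect rather than a guaranteed plan.

```sql
-- Hypothetical table for illustration; none of these names come from the question.
CREATE TABLE prefix_demo (
  id INT AUTO_INCREMENT PRIMARY KEY,
  textcol TEXT NOT NULL,
  INDEX idx_textcol_1 (textcol(1))   -- prefix index on the first character only
);

INSERT INTO prefix_demo (textcol) VALUES ('ab'), ('ac'), ('a'), ('abc'), ('b');

-- Step 1: the index can narrow the search to rows whose value starts with 'a'.
-- Step 2: the server compares the full column value against the WHERE clause.
-- Only the row holding exactly 'ab' comes back, not every row starting with 'a'.
SELECT id, textcol FROM prefix_demo WHERE textcol = 'ab';

-- Inspect the plan; when the prefix index is chosen, the extra row-level
-- filtering usually shows up as "Using where" in the Extra column.
EXPLAIN SELECT id, textcol FROM prefix_demo WHERE textcol = 'ab';
```

The query is written exactly as it would be with a full index; only the plan differs.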
The caveats include the fact that a prefix index won't be used by the optimizer for some operations, like sorting or grouping, because it doesn't cover enough of the column data for those purposes.
A prefix index isn't sorted beyond the length of the prefix. If your query uses a full index to find rows, you'll often find that the rows are returned sorted in index order implicitly. If your application expects this behavior, then it is expecting something it should not expect, because the order in which rows are returned is undefined unless you explicitly ORDER BY. Don't rely on coincidental behavior in any query: not only are the rows matched by a prefix index not necessarily in any particular order, but the order of any result set where ordering is not explicit is subject to change at any time.
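A short sketch of the ordering point, again using a hypothetical table (order_demo and its index name are assumptions for illustration). The comment about the plan describes what one would typically expect to see, not guaranteed output.

```sql
-- Hypothetical table; names are illustrative only.
CREATE TABLE order_demo (
  id INT AUTO_INCREMENT PRIMARY KEY,
  textcol TEXT NOT NULL,
  INDEX idx_textcol_3 (textcol(3))   -- only the first 3 characters are indexed
);

-- No ORDER BY: the row order is unspecified and may change between runs,
-- versions, or plans, whether or not an index is used.
SELECT id, textcol FROM order_demo WHERE textcol LIKE 'abc%';

-- If the application needs an order, state it explicitly.
SELECT id, textcol FROM order_demo WHERE textcol LIKE 'abc%' ORDER BY textcol;

-- The prefix index cannot supply the sort order, so the plan typically
-- shows "Using filesort" in the Extra column.
EXPLAIN SELECT id, textcol FROM order_demo ORDER BY textcol;
```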
And a prefix index can't be used as a covering index. A covering index refers to the case where all of the columns in a SELECT happen to be included together in one index (plus optionally the primary key, since it's always there too). The optimizer will read the data directly from the index, instead of using the index to identify rows to look up in the main table data. Even if the index can't be used to look up the matching rows, the optimizer will do a full scan of only a covering index, instead of doing a full scan of the entire table, saving I/O and time. (This capability, by the way, should be enough reason to select the columns you want, instead of the lazy SELECT *, since it potentially opens up some more efficient query plans.) A prefix index can't be used for this, either.
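The difference can be compared side by side with a sketch like the following. The table cover_demo and its column and index names are hypothetical, and InnoDB is assumed (so the primary key is carried in every secondary index). The comments describe the plans one would usually expect; the optimizer may choose differently on real data.

```sql
-- Hypothetical table; names are illustrative only. InnoDB is assumed.
CREATE TABLE cover_demo (
  id INT AUTO_INCREMENT PRIMARY KEY,
  full_col VARCHAR(100) NOT NULL,
  prefix_col VARCHAR(100) NOT NULL,
  INDEX idx_full (full_col),           -- full index on the whole column
  INDEX idx_prefix (prefix_col(10))    -- prefix index on the first 10 characters
);

-- Every selected column lives in idx_full (id comes along as the primary key),
-- so the plan can usually be satisfied from the index alone:
-- look for "Using index" in the Extra column.
EXPLAIN SELECT id, full_col FROM cover_demo WHERE full_col = 'example';

-- idx_prefix stores only part of prefix_col, so the full value has to be
-- read from the row itself; "Using index" should not appear here.
EXPLAIN SELECT id, prefix_col FROM cover_demo WHERE prefix_col = 'example';
```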
But aside from performance and optimizations and queries that implicitly do something you expect (which you should not be expecting), there is no logic-related caveat that comes to mind with a prefix index. The result will still be correct.
All too often, a "prefix index" is useless. I have seen cases where the optimizer ignored a prefix index when I thought it could use it.
If your TEXT field is never bigger than 255 characters, change it to VARCHAR(255) (or smaller); then use a real index, not a prefix index.
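A rough sketch of that conversion, assuming a hypothetical table notes_demo with an existing prefix index (all names here are made up for illustration). Check the longest stored value before shrinking the column, and try it on a copy of the data first.

```sql
-- Hypothetical starting point; names are illustrative only.
CREATE TABLE notes_demo (
  id INT AUTO_INCREMENT PRIMARY KEY,
  textcol TEXT NOT NULL,
  INDEX idx_textcol_prefix (textcol(20))
);

-- Confirm that no existing value is longer than the new column size.
SELECT MAX(CHAR_LENGTH(textcol)) AS longest FROM notes_demo;

-- Replace the TEXT column and its prefix index with a VARCHAR and a full index.
ALTER TABLE notes_demo
  DROP INDEX idx_textcol_prefix,
  MODIFY textcol VARCHAR(255) NOT NULL,
  ADD INDEX idx_textcol (textcol);
```

Depending on the character set and the InnoDB row format, a full index on VARCHAR(255) may run into the index key length limit, in which case a smaller column size is the simpler fix.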
Would it just give me all rows starting with 'a' or would it check the whole column value?
Assuming you have INDEX(textcol(1)), it would have to scan all the rows starting with 'a' to find the row(s) with textcol = 'ab' and deliver only those rows. Note that it is a performance question, not a correctness question (as @Michaelsqlbot so eloquently spelled out).