I have a table structure similar the following:
create table MAIL (
ID int,
FROM varchar,
SENT_DATE date
);
create table MAIL_TO (
ID int,
MAIL_ID int,
NAME varchar
);
and I need to run the following query:
select m.ID
from MAIL m
inner join MAIL_TO t on t.MAIL_ID = m.ID
where m.SENT_DATE between '07/01/2010' and '07/30/2010'
and t.NAME = '[email protected]'
Is there any way to design indexes such that both of the conditions can use an index? If I put an index on MAIL.SENT_DATE and an index on MAIL_TO.NAME, the database will choose to use either one of the indexes or the other, not both. After filtering by the first condition the database always has to do a full scan of the results for the second condition.
Yep, indexed views are the answer. You can join the two tables in a view and create a unique clustered index on the view for the required columns.
A database composite index or multi-column index is an index that is based on several columns.
The Nasdaq Composite Index comprises of all Nasdaq domestic and international stocks listed on the Nasdaq Stock Market while the Nasdaq 100 index is a large-cap growth index and includes 100 of the top domestic and international non-financial companies based on market capitalization.
Oracle can use both indices. You just don't have the right two indices.
Consider: if the query plan uses your index on mail.sent_date
first, what does it get from mail
? It gets all the mail.id
s where mail.sent_date
is within the range you gave in your where
clause, yes?
So it goes to mail_to
with a list of mail.id
s and the mail.name
you gave in your where
clause. At this point, Oracle decides that it's better to scan the table for matching mail_to.mail_id
s rather than use the index on mail_to.name
.
Indices on varchars are always problematic, and Oracle really prefers full table scans. But if we give Oracle an index containing the columns it really wants to use, and depending on total table rows and statistics, we can get it to use it. This is the index:
create index mail_to_pid_name on mail_to( mail_id, name ) ;
This works where an index just on name
doesn't, because Oracle's not looking just for a name, but for a mail_id
and a name
.
Conversely, if the cost-based analyzer determines it's cheaper to go to table mail_to
first, and uses your index on mail_to.name
, what doe sit get? A bunch of mail_to_.mail_id
s to look up in mail
. It needs to find rows with those ids and certain sent_dates, so:
create index mail_id_sentdate on mail( sent_date, id ) ;
Note that in this case I've put sent_date
first in the index, and id
second. (This is more an intuitive thing.)
Again, the take home point is this: in creating indices, you have to consider not just the columns in your where
clause, but also the columns in your join conditions.
Update
jthg: yes, it always depends on how the data is distributed. And on how many rows are in the table: if very many, Oracle will do a table scan and hash join, if very few it will do a table scan. You might reverse the order of either of the two indices. By putting sent_date first in the second index, we eliminate most needs for an index solely on sent_date
.
A materialized view would allow you to index the values, assuming the stringent materialized view criteria is met.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With