
Creating an appropriate index for a frequently used query in SQL Server

In my application I have two queries that will be used quite frequently. Their WHERE clauses are the following:

WHERE FieldA = @P1 AND (FieldB = @P2 OR FieldC = @P2)

and

WHERE FieldA = @P1 AND FieldB = @P2

@P1 and @P2 are parameters entered in the UI or coming from external data sources.

  • FieldA is an int and highly non-unique: there are only two to four distinct values in a table with, say, 20,000 rows
  • FieldB is a varchar(20) and is "almost" unique, there will be only very few rows where FieldB might have the same value
  • FieldC is a varchar(15) and also highly distinct, but not as much as FieldB
  • FieldA and FieldB together are unique (but do not form my primary key, which is a simple auto-incrementing identity column with a clustered index)

I'm now wondering what the best way is to define an index that speeds up specifically these two queries. Should I define one index with...

FieldB (or better FieldC here?)
FieldC (or better FieldB here?)
FieldA

... or better two indices:

FieldB
FieldA

and

FieldC
FieldA

Or are there even other and better options? What's the best way and why?
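
To make the two options concrete, here is a minimal T-SQL sketch. The table name dbo.MyTable, the column name Id, the nullability of the columns and all index names are assumptions made for illustration; only the column types and the clustered identity primary key come from the description above.

```sql
-- Hypothetical table matching the description (names and nullability are assumed).
CREATE TABLE dbo.MyTable (
    Id     int IDENTITY(1,1) NOT NULL,
    FieldA int         NOT NULL,
    FieldB varchar(20) NOT NULL,
    FieldC varchar(15) NULL,
    CONSTRAINT PK_MyTable PRIMARY KEY CLUSTERED (Id)
);

-- Option 1: a single composite index on all three columns.
CREATE INDEX IX_MyTable_B_C_A ON dbo.MyTable (FieldB, FieldC, FieldA);

-- Option 2: two narrower indexes, one per selective column.
CREATE INDEX IX_MyTable_B_A ON dbo.MyTable (FieldB, FieldA);
CREATE INDEX IX_MyTable_C_A ON dbo.MyTable (FieldC, FieldA);
```

With option 1, FieldC is not the leading key column, so a predicate on FieldC alone cannot be answered by a seek on that index. Only one of the two options would be created in practice.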

Thank you for suggestions in advance!

Edit:

A note for other readers: there was another answer here, which has since been deleted. That answer seemed very useful to me. It recommended creating two indices (my second option above) and rewriting the first query as a UNION of two SELECT statements (one with WHERE FieldA = @P1 AND FieldB = @P2 and one with WHERE FieldA = @P1 AND FieldC = @P2) instead of using OR, so that both indices can be used (which, according to that answer, wouldn't be the case with the OR operator).
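
The UNION rewrite described in that deleted answer might look like the sketch below. The table name dbo.MyTable, the column Id and the parameter values are assumptions for illustration. Whether this actually beats the OR form depends on the optimizer, so it is worth comparing the execution plans of both versions.

```sql
-- Hypothetical parameter values, for illustration only.
DECLARE @P1 int = 1,
        @P2 varchar(20) = 'ABC';

-- First branch can use an index on (FieldB, FieldA).
SELECT Id, FieldA, FieldB, FieldC
FROM dbo.MyTable
WHERE FieldA = @P1 AND FieldB = @P2

UNION  -- removes the duplicate when a row matches both branches

-- Second branch can use an index on (FieldC, FieldA).
SELECT Id, FieldA, FieldB, FieldC
FROM dbo.MyTable
WHERE FieldA = @P1 AND FieldC = @P2;
```

UNION rather than UNION ALL is used here so that a row satisfying both conditions is returned once, as it would be with OR.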

Edit2:

The statement that with OR the indexes are not used and that a UNION is preferable seems to be wrong - at least according to my own tests (see my own answer below).

Slauma asked Jun 13 '10 16:06



1 Answer

Extending Remus' (edit: now deleted) answer...

  • if @p2 is varchar(15), then you can't reliably compare it against FieldB, which is varchar(20)
  • if @p2 is varchar(20), then FieldC will be converted to varchar(20) and not use an index (or at best scan it)
  • if FieldA only has 2, 3 or 4 distinct values, then why not make it a tinyint and reduce the table/index size?

I wouldn't bother with indexes until you resolve this datatype precedence issue: this is on top of the OR clause issue.
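
The first bullet can be illustrated with a short sketch. When a value longer than 15 characters is assigned to a varchar(15) variable, SQL Server truncates it silently, so a later comparison against the varchar(20) column FieldB would use the shortened value. The literal below is made up for illustration.

```sql
-- A 20-character value assigned to a varchar(15) variable is cut to 15 characters.
DECLARE @p2 varchar(15) = 'ABCDEFGHIJKLMNOPQRST';

SELECT @p2      AS StoredValue,   -- ABCDEFGHIJKLMNO
       LEN(@p2) AS StoredLength;  -- 15
```

A row whose FieldB holds the full 20-character value would therefore not match WHERE FieldB = @p2, and no error or warning is raised.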

Finally, a column is unique or non-unique: there is no in between. Statistics help here with selectivity, but it's irrelevant.

I would reverse the indexes from Remus' answer to be (FieldB, FieldA) (and unique) and (FieldC, FieldA), because of FieldA's low selectivity.
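
As a sketch, the two indexes in that order could be created as follows. The table name dbo.MyTable and the index names are assumptions; the unique index relies on the question's statement that FieldA and FieldB together are unique.

```sql
-- Selective column first, low-selectivity FieldA second.
-- UNIQUE is valid only because (FieldA, FieldB) is stated to be unique.
CREATE UNIQUE INDEX UX_MyTable_FieldB_FieldA
    ON dbo.MyTable (FieldB, FieldA);

CREATE INDEX IX_MyTable_FieldC_FieldA
    ON dbo.MyTable (FieldC, FieldA);
```

Putting FieldB or FieldC first lets a seek narrow the search to a handful of rows right away, whereas leading with FieldA would first narrow it only to roughly a quarter to a half of the table.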

Edit, after comments: you can't compare the use of @p2 against the use of constant strings.

gbn answered Oct 21 '22 00:10
