What does a B-tree index on more than one column look like?

Tags:

database

sql-server

indexing

oracle

So I was reading up on indexes and their implementation, and I stumbled upon this website that has a brief explanation of b-tree indexes:

http://20bits.com/articles/interview-questions-database-indexes/

The b-tree index makes perfect sense for an index on a single column, but suppose I create an index on multiple columns. How does the b-tree work then? What is the value stored in each node of the b-tree?

For example, if I have this table:

    table customer:
        id            number
        name          varchar
        phone_number  varchar
        city          varchar

and I create an index on: (id, name, city)

and then run the following query:

    SELECT id, name
      FROM customer
     WHERE city = 'My City';

how does this query utilize the multiple column index, or does it not utilize it unless the index is created as (city, id, name) or (city, name, id) instead?

asked Oct 30 '09 05:10

Lawrence

2 Answers

With most implementations, the key is simply a longer key that includes all of the indexed column values, in index order, with a separator. No magic there ;-)

In your example the key values could look something like

    "123499|John Doe|Conway, NH"
    "32144|Bill Gates|Seattle, WA"

One of the characteristics of these indexes with composite keys is that the index entries themselves can be used in some cases to "cover" the query, that is, to answer it without reading the full record.

For example, if the query is to find the Name and City given the ID, since the ID is first in the index, the index can search by this efficiently. Once it has found the matching index entry, it can "parse" the Name and City from the key, and doesn't need to fetch the full record to read the same.

If, however, the query also wanted to display the phone number, then the logic would follow the index entry down to the full record, since the phone number is not part of the key.
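One way to see the "covering" behaviour for yourself is to ask a database for its query plan. The following is only a rough sketch using SQLite through Python's `sqlite3` module, which is an assumption on my part (the question is tagged sql-server and oracle, and their plan output looks different). The index name `idx_customer` and the column types are made up for illustration.

```python
import sqlite3

# In-memory database mirroring the question's table (SQLite is only an illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id NUMERIC, name TEXT, phone_number TEXT, city TEXT)"
)
# idx_customer is a hypothetical name for the (id, name, city) index.
conn.execute("CREATE INDEX idx_customer ON customer (id, name, city)")


def plan(sql):
    """Return the 'detail' text of each EXPLAIN QUERY PLAN row for sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]  # the plan text is the fourth column


# Every selected column is part of the index key, so the index alone can answer this.
covered = plan("SELECT name, city FROM customer WHERE id = 123499")

# phone_number is not in the index, so the table row has to be read as well.
not_covered = plan(
    "SELECT name, city, phone_number FROM customer WHERE id = 123499"
)

print(covered)
print(not_covered)
```

The exact wording of the plan text varies between SQLite versions, but the first plan should mention a covering index while the second should mention only a plain index lookup.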

answered Sep 30 '22 22:09

mjv

Imagine that the key is represented by a Python tuple (col1, col2, col3) ... the indexing operation involves comparing tuple_a with tuple_b ... if you don't know which values of col1 and col2 you are interested in, but only col3, then the lookup would have to read the whole index ("full index scan"), which is not as efficient.
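The tuple analogy can be made concrete with a sorted list standing in for the index. This is just a sketch: the sample keys and the helper names `seek_col1` and `scan_col3` are invented for illustration, and a real B-tree walks pages rather than a Python list.

```python
import bisect

# Hypothetical composite keys (col1, col2, col3), kept sorted as an index would keep them.
keys = sorted([
    (3, "b", "x"),
    (1, "a", "y"),
    (2, "c", "x"),
    (1, "b", "z"),
    (2, "a", "y"),
])


def seek_col1(value):
    """Binary-search to the first key whose col1 equals value, then read the run."""
    # The 1-tuple (value,) sorts before every longer tuple that starts with value.
    start = bisect.bisect_left(keys, (value,))
    matches = []
    for key in keys[start:]:
        if key[0] != value:
            break  # keys are sorted, so the run of matches has ended
        matches.append(key)
    return matches


def scan_col3(value):
    """col3 is not a leading part of the key, so every key has to be inspected."""
    return [key for key in keys if key[2] == value]


print(seek_col1(2))    # [(2, 'a', 'y'), (2, 'c', 'x')]
print(scan_col3("x"))  # [(2, 'c', 'x'), (3, 'b', 'x')]
```

Searching on col1 can jump straight to the right spot because the tuples are ordered by col1 first. Searching on col3 alone gets no help from that ordering, which is the "full index scan" case.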

If you have an index on (col1, col2, col3), then you can expect that any RDBMS will use the index (in a direct manner) when the WHERE clause references (1) all 3 columns, (2) both col1 and col2, or (3) only col1.

Otherwise (e.g. only col3 in the WHERE clause), either the RDBMS will not use that index at all (e.g. SQLite), or will do a full index scan (e.g. Oracle) [if no other index is better].
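If you want to check how a particular engine treats each of these cases, print the plan for each WHERE clause and compare. Here is a small sketch with SQLite via Python's `sqlite3` module. The table `t`, the index `idx_t` and the extra column `other` are hypothetical, and other engines (and other SQLite versions) may word or choose their plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and index names; only the column order of the index matters here.
conn.execute("CREATE TABLE t (col1 INTEGER, col2 TEXT, col3 TEXT, other TEXT)")
conn.execute("CREATE INDEX idx_t ON t (col1, col2, col3)")

queries = {
    "all three": "SELECT * FROM t WHERE col1 = 1 AND col2 = 'a' AND col3 = 'x'",
    "col1 and col2": "SELECT * FROM t WHERE col1 = 1 AND col2 = 'a'",
    "col1 only": "SELECT * FROM t WHERE col1 = 1",
    "col3 only": "SELECT * FROM t WHERE col3 = 'x'",
}

plans = {}
for label, sql in queries.items():
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The fourth column of each row holds the human-readable plan text.
    plans[label] = " / ".join(row[3] for row in rows)
    print(f"{label:>14}: {plans[label]}")
```

The first three plans should show an index search on `idx_t`. What gets printed for the col3-only query is the interesting part, and it is worth running on your own engine rather than taking on trust.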

In your specific example, presuming that id is a unique identifier of a customer, it is pointless to have it appear in an index (other than the index that your DBMS should set up for a primary key or column noted as UNIQUE).

answered Sep 30 '22 22:09

John Machin
