
Big-Oh Performance of an Inner Join on Two Indexes

I am trying to figure out the Big-Oh performance of the following query:

SELECT * 
FROM table1 INNER JOIN table2 ON table1.a = table2.b
GROUP BY table1.a

table1.a is the primary key of the table. table2.b has a non-unique index on it.
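
For concreteness, the setup described would look roughly like this (MySQL syntax; the INT column types and the index name are assumptions for illustration, not part of the question):

CREATE TABLE table1 (
    a INT PRIMARY KEY              -- unique primary-key index on a
);

CREATE TABLE table2 (
    b INT,
    INDEX idx_table2_b (b)         -- non-unique secondary index on b
);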

My thought is that since each index can be searched in O(log n), this query performs in O(log n * log m), where n is the number of rows in table1 and m is the number of rows in table2.

Any input would be appreciated.

asked May 16 '13 by tlovett1

People also ask

Do indexes improve join performance?

Indexes can help improve the performance of a nested-loop join in several ways. The biggest benefit often comes when you have a clustered index on the joining column in one of the tables. The presence of a clustered index on a join column frequently determines which table SQL Server chooses as the inner table.

What is the time complexity of inner join?

If both tables are already sorted according to the keys being used in the join, then the query will have a time complexity of O(M + N). If both tables have an index on the joined columns, the index already maintains those columns in order and there is no need to sort, so the complexity is again O(M + N).

What is the most efficient way of joining two tables in the same database?

Relational algebra is the most common way of writing a query and also the most natural way to do so. The code is clean, easy to troubleshoot, and, unsurprisingly, it is also the most efficient way to join two tables.


2 Answers

Your thinking is a bit off. An index can be searched in O(log n) for a single lookup, but your query would presumably be doing "n" or "m" such lookups.

Let me assume that the query is processed by joining the two tables together, scanning one table and looking up the values in the other, and then doing sort-based aggregation for the GROUP BY.

The "matching" piece of the query is then the larger of:

  • O(n log m)
  • O(m log n)

This assumes that the query engine decides to scan one table and look up values in the index in the other.
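
To make the per-row cost concrete: under this plan, for each of the n rows scanned from table1 the engine effectively performs a point lookup against the index on table2.b, conceptually like the statement below (the literal 42 merely stands in for the current table1.a value and is only an illustration):

SELECT * FROM table2 WHERE table2.b = 42

Each such lookup is an O(log m) index seek, which is where the O(n log m) term above comes from.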

To continue: once you look up values, you need to fetch the data from the pages of the table where you used the index. Fetching data is technically O(n). Our estimate so far is O((n log m) + n).

The aggregation should be O(n log n) for a sort followed by a scan. But, how many records do you have for the aggregation? You could have as many as n*m matches to the join. Or, as few as 0 (it is an inner join).

This is big-O, which is an upper bound. So, we have to use the bigger estimate. This results in O((n*m)log(n*m)) for the aggregation, which would dominate other terms. The big-O would be O((n*m) log(n*m)).
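
Putting the pieces of this answer together (index nested-loop matching, row fetches, then a sort-based GROUP BY over up to n*m joined rows, per the worst-case assumption above):

    O(n log m)           -- probe table2's index once per row of table1
  + O(n)                 -- fetch the matched rows from the table pages
  + O((n*m) log(n*m))    -- sort up to n*m join results for the aggregation
  = O((n*m) log(n*m))    -- the aggregation term dominates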

answered Oct 19 '22 by Gordon Linoff

The performance of the query depends on how the SQL statement is executed internally.

Maybe you could look into EXPLAIN (for MySQL: http://dev.mysql.com/doc/refman/5.1/en/explain.html) to get more information on how your query actually gets executed, as this can yield more accurate results than reasoning about Big-Oh.
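
For example, you could prefix the original query with EXPLAIN (the exact output depends on your MySQL version and table statistics):

EXPLAIN
SELECT * 
FROM table1 INNER JOIN table2 ON table1.a = table2.b
GROUP BY table1.a

The type, key, and rows columns of the result show whether each table is scanned or accessed through the primary key or the index on b, i.e. which of the cost estimates above actually applies to your data.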

Btw: Gordon Linoff's answer looks good if you're really looking for Big-Oh!

answered Oct 19 '22 by lorey