Is the time complexity of the Oracle MAX function O(1), O(log n) or O(n) with respect to the number of rows in a table?
MAX is an aggregate function that evaluates the maximum of an expression over a set of rows (see Aggregates (set functions)). MAX is allowed only on expressions that evaluate to built-in data types (including CHAR, VARCHAR, DATE, TIME, CHAR FOR BIT DATA, etc.).
The MAX() function is used with the WHERE clause to gain further insights from our data. In SQL, the MAX() function computes the highest or maximum value of numeric values in a column.
SQL MIN() and MAX() Functions The MIN() function returns the smallest value of the selected column. The MAX() function returns the largest value of the selected column.
Try using this SQL SELECT statement: SELECT * FROM employees WHERE department_id=30 AND salary = (SELECT MAX(salary) FROM employees WHERE department_id=30); This will return the employee information for only the employee in department 30 that has the highest salary.
If you have a B-tree index on the column then finding the maximum value is O(log(n)) because the answer will be the last (or first) row of the index. Values are stored in the deepest nodes of the B-tree which has a height O(log(n)).
Without an index it is O(n) because all rows must be read to determine the maximum value.
Note: The O(n) notation ignores constants but in the real world these constants cannot be ignored. The difference between reading from disk and reading from memory is several orders of magnitude. Accessing the first value of an index is likely to be performed mostly in RAM, whereas a full table scan of a huge table will need to read mostly from disk.
Realistically, it's hard to say without specifying a query, a table definition, and a query plan.
If you have a table that has no index on the column you're computing the MAX
on, Oracle will have to do a full table scan. That is going to be O(n) since you've got to scan every block in the table. You can see that by looking at the query plan.
We'll generate a table with 100,000 rows and ensure that the rows are reasonably large using a CHAR(1000)
column
SQL> create table foo( col1 number, col2 char(1000) );
Table created.
SQL> insert into foo
2 select level, lpad('a',1000)
3 from dual
4 connect by level <= 100000;
100000 rows created.
Now, we can look at the plan for the basic MAX
operation. This is doing a full table scan (an O(n) operation)
SQL> set autotrace on;
SQL> select max(col1)
2 from foo;
MAX(COL1)
----------
100000
Execution Plan
----------------------------------------------------------
Plan hash value: 1342139204
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 4127 (1)| 00:00:50 |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
| 2 | TABLE ACCESS FULL| FOO | 106K| 1350K| 4127 (1)| 00:00:50 |
---------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
29 recursive calls
1 db block gets
14686 consistent gets
0 physical reads
176 redo size
527 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
If you create an index on the column you're computing the MAX
of, Oracle can do a MIN/MAX scan
on the index. That is an O(log n) operation if that's the plan the optimizer chooses. Of course, as a practical matter, this is functionally an O(1) operation because the height of an index is never realistically going to exceed 4 or 5-- the constant terms here are going to dominate.
SQL> create index idx_foo_col1
2 on foo( col1 );
Index created.
SQL> select max(col1)
2 from foo;
MAX(COL1)
----------
100000
Execution Plan
----------------------------------------------------------
Plan hash value: 817909383
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
| 2 | INDEX FULL SCAN (MIN/MAX)| IDX_FOO_COL1 | 1 | 13 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
5 recursive calls
0 db block gets
83 consistent gets
1 physical reads
0 redo size
527 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
But then things get harder. Both MIN
and MAX
have the same O(log n) behavior individually. But if you have both MIN
and MAX
in the same query, suddenly you're back to an O(n) operation. Oracle (as of 11.2) hasn't implemented an option grab both the first block and the last block of an index
SQL> ed
Wrote file afiedt.buf
1 select min(col1), max(col1)
2* from foo
SQL> /
MIN(COL1) MAX(COL1)
---------- ----------
1 100000
Execution Plan
----------------------------------------------------------
Plan hash value: 1342139204
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 4127 (1)| 00:00:50 |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
| 2 | TABLE ACCESS FULL| FOO | 106K| 1350K| 4127 (1)| 00:00:50 |
---------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
4 recursive calls
0 db block gets
14542 consistent gets
0 physical reads
0 redo size
601 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Of course, in subsequent versions of Oracle, this optimization might be implemented and this would go back to being an O(log n) operation. Of course, you can also rewrite the query to get a different query plan that goes back to being O(log n)
SQL> ed
Wrote file afiedt.buf
1 select (select min(col1) from foo) min,
2 (select max(col1) from foo) max
3* from dual
SQL>
SQL> /
MIN MAX
---------- ----------
1 100000
Execution Plan
----------------------------------------------------------
Plan hash value: 3561244922
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
| 2 | INDEX FULL SCAN (MIN/MAX)| IDX_FOO_COL1 | 1 | 13 | 2 (0)| 00:00:01 |
| 3 | SORT AGGREGATE | | 1 | 13 | | |
| 4 | INDEX FULL SCAN (MIN/MAX)| IDX_FOO_COL1 | 1 | 13 | 2 (0)| 00:00:01 |
| 5 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
7 recursive calls
0 db block gets
166 consistent gets
0 physical reads
0 redo size
589 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With