When to choose rank() over dense_rank() or row_number()

Question

Since we can get the row number assigned using the row_number() and if we want to find the rank of each row without skipping of any number within the partition using dense_rank(), why do we need rank() function, I can't think of any use case that rank() function is offering that either dense_rank() or row_number() cannot fulfil that.

Is there any use case where rank() is best suited?

Tim Biegeleisen · Accepted Answer

When applying either RANK or DENSE_RANK to a column which has no ties, they would both result in the same series which would be generated by ROW_NUMBER. The difference between RANK and DENSE_RANK appears in the presence of ties, and is subtle. Consider the following table, along with row number, rank, and dense rank values:

SALARY | ROW_NUMBER | RANK | DENSE_RANK
1000   | 1          | 1    | 1
1500   | 2          | 2    | 2
1500   | 3          | 2    | 2
2000   | 4          | 4    | 3
2200   | 5          | 5    | 4
2500   | 6          | 6    | 5
2500   | 7          | 6    | 5
2500   | 8          | 6    | 5
3000   | 9          | 9    | 6

Hopefully you can see above that when a tie of two or more records appears, both RANK and DENSE_RANK assign the same rank to all records with the same value. However, where they differ is that RANK continues the rank count in line with the ROW_NUMBER series, while DENSE_RANK does not, and instead continues the rank count with the next value after the duplicate rank.

Now getting to your question, whether you choose RANK or DENSE_RANK depends on your requirements. For example, if you were reporting winners, and always needed to report a first, second, and third place, regardless of ties for each place, you would use DENSE_RANK. Otherwise, you would use RANK, which might mean that there is no second or third place. If you were certain that there could never be duplicates, then you could just use ROW_NUMBER.

When to choose rank() over dense_rank() or row_number()

Tags:

sql

sql-server

hive

hiveql

Ankur

1 Answers

Tim Biegeleisen

Recent Activity

Donate For Us

When to choose rank() over dense_rank() or row_number()

Tags:

sql

sql-server

hive

hiveql

Ankur

1 Answers

Tim Biegeleisen

Related questions

Recent Activity

Donate For Us