Since we can get the row number assigned using the row_number()
and if we want to find the rank of each row without skipping of any number within the partition using dense_rank()
, why do we need rank()
function, I can't think of any use case that rank()
function is offering that either dense_rank()
or row_number()
cannot fulfil that.
Is there any use case where rank()
is best suited?
When applying either RANK
or DENSE_RANK
to a column which has no ties, they would both result in the same series which would be generated by ROW_NUMBER
. The difference between RANK
and DENSE_RANK
appears in the presence of ties, and is subtle. Consider the following table, along with row number, rank, and dense rank values:
SALARY | ROW_NUMBER | RANK | DENSE_RANK
1000 | 1 | 1 | 1
1500 | 2 | 2 | 2
1500 | 3 | 2 | 2
2000 | 4 | 4 | 3
2200 | 5 | 5 | 4
2500 | 6 | 6 | 5
2500 | 7 | 6 | 5
2500 | 8 | 6 | 5
3000 | 9 | 9 | 6
Hopefully you can see above that when a tie of two or more records appears, both RANK
and DENSE_RANK
assign the same rank to all records with the same value. However, where they differ is that RANK
continues the rank count in line with the ROW_NUMBER
series, while DENSE_RANK
does not, and instead continues the rank count with the next value after the duplicate rank.
Now getting to your question, whether you choose RANK
or DENSE_RANK
depends on your requirements. For example, if you were reporting winners, and always needed to report a first, second, and third place, regardless of ties for each place, you would use DENSE_RANK
. Otherwise, you would use RANK
, which might mean that there is no second or third place. If you were certain that there could never be duplicates, then you could just use ROW_NUMBER
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With