Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does ORDER BY ABS(CHECKSUM(NEWID())) work?

I have stumbled upon a random code ID generator and I can't figure out how it works.

SELECT TOP (5) c1
FROM
  (
VALUES
  ('A'), ('B'), ('C'), ('D'), ('E'), ('F'), ('G'),
  ('3'), ('4'), ('5'), ('6'), ('7'), ('8'), ('9')
  ) AS T1(c1)
ORDER BY ABS(CHECKSUM(NEWID()))

I have looked at the following:

  • select NEWID() -> E8E142CC-A918-4776-AA99-2D33DC80FE28
  • select CHECKSUM('E8E142CC-A918-4776-AA99-2D33DC80FE28') -> -2089106226
  • select ABS(-2089106226) -> 2089106226

If I then try and do an "Order by 2089106226" it obviously fails

So what is "ABS(CHECKSUM(NEWID()))" returning and what is this type of ordering called?

like image 514
Bertus Kruger Avatar asked Feb 19 '26 06:02

Bertus Kruger


1 Answers

An ORDER BY clause doesn't have to sort on a specific column as you've discovered, but rather it can contain any arbitrary expression. I don't believe there is any special name for this, it's just a dynamic value in the order_by_expression part of the ORDER BY documented here

A non-column expression in the ORDER BY might be something like

ORDER BY a_column % 5

to sort rows based on a column's value modulus 5, or like

ORDER BY CASE WHEN LEFT(a_column, 1) = 'Z' THEN 0 ELSE 1 END ASC, a_column ASC

to force all the values in column beginning with Z to sort ahead of all others, but otherwise be alphabetical, by ascribing a 0 to Z rows and 1 otherwise (where zero sorts ahead of one).

In this case, the expression is a common method of randomizing row order in T-SQL:

ABS(CHECKSUM(NEWID()))

The NEWID() function will be called for each row returned by the SELECT statement, which results in 5 separate GUID values. From there, each of those 5 is passed to CHECKSUM() which returns an index hash value which happens to be an integer. ABS() finally forces it to be a positive integer.

So the values that are sent ultimately to ORDER BY is just a list of integers derived from multiple calls to NEWID(), and ORDER BY has no problem sorting integers. In the end it isn't really different from sorting a column of integer values - it's just that these are generated at query time.

If you run this demonstration on SQLFiddle a few times, you'll get a different sort order for the otherwise incrementing id on each execution. So it looks like the reason the author of the random code generator SQL in question used that ORDER BY is to shuffle a random set of 5 characters chosen from the VALUES () list. If you run the code generator SQL and comment out the ORDER BY, it will always return the rowset A,B,C,D,E. With the ORDER BY in place, it returns a randomized set.

like image 107
Michael Berkowski Avatar answered Feb 20 '26 21:02

Michael Berkowski



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!