Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is SQL floating point sum affected by the order-by clause?

Unlike the pure mathematical real numbers, or even the rational numbers, the floating-point number types are not commutative-associative. This means, as is commonly known in numerical coding, the order of a list of floating-point numbers affects the value of the floating-point sum. It can be a good idea to sort the list to put the smaller numbers first before adding up.

An SQL table does have an order. And this can be specified in an order-by clause.

Does the SQL engine sum a field in the order given in the order-by clause?

Can I force the SQL engine to add up starting with the smallest first by sorting a list of positive floating point numbers in ascending order?

Can I force the engine to add up largest first? Or is the order of summation not dependent on the order of the tables? Or perhaps the summation order is related to the order-by order in a complicated and indeterminate manner?

I recognize that this might depend on the choice of SQL engine. My core interest is in SnowFlake.

like image 414
Bruce Avatar asked Oct 17 '21 00:10

Bruce


People also ask

How do you sum a float value in SQL Server?

You can use the ROUND or FORMAT function: SELECT ROUND(SUM(price), 2) FROM items; Alternatively you can specify precision when defining a column, e.g. FLOAT(5,2) .

How do I sum a column in SQL Server?

If you need to add a group of numbers in your table you can use the SUM function in SQL. This is the basic syntax: SELECT SUM(column_name) FROM table_name; The SELECT statement in SQL tells the computer to get data from the table.

How to use ORDER BY clause in SQL Server?

The SQL ORDER BY clause is used to sort the data in ascending or descending order, based on one or more columns. Some databases sort the query results in an ascending order by default. SELECT column-list FROM table_name [WHERE condition] [ORDER BY column1, column2, .. columnN] [ASC | DESC]; You can use more than one column in the ORDER BY clause.

How do you guarantee the Order of a column in SQL?

To guarantee a particular order, we make use of the ORDER BY clause. ORDER BY sorts on the basis of one or more columns. Records are returned in either ascending or descending order. If ASC or DESC keyword hasn’t been provided, then the results will be categorized in ascending order.

What is the basic syntax of the ORDER BY clause?

Syntax. The basic syntax of the ORDER BY clause is as follows −. SELECT column-list FROM table_name [WHERE condition] [ORDER BY column1, column2, .. columnN] [ASC | DESC]; You can use more than one column in the ORDER BY clause.

How do you get the Order of data in SQL?

In SQL, the SELECT statement does not return data in any specific order. To guarantee a particular order, we make use of the ORDER BY clause. ORDER BY sorts on the basis of one or more columns. Records are returned in either ascending or descending order.


Video Answer


1 Answers

Yes, floating point numbers are order specific, and yes, floating point numbers in Snowflake are impacted by this, they have an article somewhere talking about sort order instability and it's impact on things like ORDER-BY's, which is the opposite direction from your question.

And in theory yes, ordering a sub-select would allow for controlling for order problems. Unlike for example MS Sql Server which does not allow for ORDER-BY's on sub-select, Snowflake does allow you to do them. The real issue is do they always respect it. And I would tend to believe they don't, as most of the operations are parallelable, or which summing is one that is normally "safe", thus I would suspect you will not be able to force the order to be respected.

One idea would I had that I though would work, was to write the values to a temporary table, with the order by on that, but then you get the problem of read from that will equally likely get parallelized. You could width_bucket to chunk the values into batches and then sum those, and then sum the sums. but again within the chunks you will still get instability.

At this point is numerical stability is your primary concern and the above hackery does not meet your needs I will flip to NUMBER and use the more decimal place form.

like image 156
Simeon Pilgrim Avatar answered Oct 23 '22 12:10

Simeon Pilgrim