Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

sql server rewrites my query incorrectly?

Tags:

sql

sql-server

There is a dirty data in input. We are trying to cleanup dataset and then make some calculations on cleared data.

declare @t table (str varchar(10))
insert into @t select '12345' union all select 'ABCDE' union all select '111aa'

;with prep as
(
select *, cast(substring(str, 1, 3) as int) as str_int
from @t
where isnumeric(substring(str, 1, 3)) = 1
)

select * 
from prep
where 1=1
and case when str_int > 0 then 'Y' else 'N' end = 'Y'
--and str_int > 0

Last 2 lines are doing the same thing. First one works, but if you uncomment second one it will crash with Conversion failed when converting the varchar value 'ABC' to data type int.

Obviously, SQL Server is rewriting query mixing all the conditions together. My guess it that it considers 'case' as a havy operation and performs it as a last step. That's why workaround with case works.

Is this behavior documented in any way? or is it a bug?

like image 419
vav Avatar asked Jul 01 '14 13:07

vav


People also ask

How do I rollback changes in SQL Server?

You can see that the syntax of the rollback SQL statement is simple. You just have to write the statement ROLLBACK TRANSACTION, followed by the name of the transaction that you want to rollback.

How do I undo a SQL query?

The ROLLBACK command is the transactional command used to undo transactions that have not already been saved to the database. This command can only be used to undo transactions since the last COMMIT or ROLLBACK command was issued.

How do I check timeouts in SQL Server?

Using SQL Server Management StudioIn Object Explorer, right-click a server and select Properties. Click the Connections node. Under Remote server connections, in the Remote query timeout box, type or select a value from 0 through 2,147,483,647 to set the maximum number seconds for SQL Server to wait before timing out.

What may possible reason query timeout?

The connection or login timeout occurs when the initial connection to the database server reaches a predefined time-out period. At this stage, no query has been submitted to the server. These are examples of connection or login time-out error messages: Connection Timeout Expired.


1 Answers

This is a known issue with SQL Server, and Microsoft does not consider it a bug although users do. The difference between the two queries is the execution path. One is doing the conversion before the filtering, the other after.

SQL Server reserves the right to re-order the processing. The documentation does specify the logical processing of clauses as:

  1. FROM
  2. ON
  3. JOIN
  4. WHERE
  5. GROUP BY
  6. WITH CUBE or WITH ROLLUP
  7. HAVING
  8. SELECT
  9. DISTINCT
  10. ORDER BY
  11. TOP

With (presumably but not explicitly documented here) CTEs being logically processed first. What does logically processed mean? Well, it doesn't mean that run-time errors are caught. It really determines the scope of identifiers during the compile phase.

When SQL Server reads from a data source, it can add new variables in. This is a convenient time to do this, because everything is in memory. However, this might occur before the filtering, which is what is causing the error when it occurs.

The fix to this problem is to use a case statement. So, the following CTE will usually work:

with prep as (
      select *, (case when isnumeric(substring(str, 1, 3)) = 1 and str not like '%.%'
                      then cast(substring(str, 1, 3) as int)
                 end) as str_int
      from @t
      where isnumeric(substring(str, 1, 3)) = 1
     )

Looks weird. And I think Redmond thinks so too. SQL Server 2012 introduced try_convert() (see here) which returns NULL if the conversion fails.

It would also help if you could instruct SQL Server to materialize CTEs. That would also solve the problem in this case. You can vote on adding such an option to SQL Server here.

like image 80
Gordon Linoff Avatar answered Oct 12 '22 01:10

Gordon Linoff