
Relative performance in SQL Server of SUBSTRING vs. a RIGHT/LEFT combo

This is a performance-based question, not an "I don't understand" or "best practice" question.

I have a varchar field in a SQL Server database that is guaranteed to be longer than 7 characters. I need to extract a char(4) value consisting of the 2nd, 3rd, 4th and 5th characters of the varchar.

For example, if the varchar had the value 1234567890, I would be looking for the 2345 part.

Is there a performance benefit to using a substring over a right-left combo?

SELECT SUBSTRING(account,2,4) FROM payment

or

SELECT RIGHT(LEFT(account,5),4) FROM payment

I have noticed a slight advantage for the RIGHT/LEFT version on a table with 1,760,335 records, but I am not sure whether this is due to query caching or the like.
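One way to reduce the influence of caching on such a comparison is to warm the buffer cache first and discard the result into a variable, so that neither disk reads nor client transfer dominate the timing. The following is a minimal sketch, assuming the payment table and account column from the question; the @sink variable is a hypothetical name introduced only to swallow the output.

```sql
-- Sketch only: compares CPU and elapsed time for the two expressions.
-- Assumes the payment table and account column from the question.
DECLARE @sink char(4);  -- hypothetical variable that absorbs the result

-- Untimed warm-up pass so both timed runs read the table from memory.
SELECT @sink = SUBSTRING(account, 2, 4) FROM payment;

SET STATISTICS TIME ON;

-- Timed run 1: SUBSTRING
SELECT @sink = SUBSTRING(account, 2, 4) FROM payment;

-- Timed run 2: RIGHT(LEFT(...))
SELECT @sink = RIGHT(LEFT(account, 5), 4) FROM payment;

SET STATISTICS TIME OFF;
```

The timings appear in the Messages output. Running the batch several times and swapping the order of the two timed queries should help show whether a small difference is real or just noise.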

UPDATE: I've done a bit more homework. It seems that in this case the RIGHT/LEFT is ultimately performed as a RIGHT/SUBSTRING. Is this a rule, or is it just the way SQL Server decided to skin this particular cat?
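For anyone wanting to check the rewrite on their own instance, one option is to ask for the estimated plan as text and read the expression shown for the scalar computation. This is a sketch assuming the same payment table; GO is the batch separator understood by SSMS and sqlcmd rather than a T-SQL statement, and the exact plan text may vary by version.

```sql
-- Sketch only: shows the estimated plan instead of executing the query.
-- SET SHOWPLAN_TEXT must be the only statement in its batch.
SET SHOWPLAN_TEXT ON;
GO

-- Not executed while SHOWPLAN_TEXT is ON; only the plan text is returned.
SELECT RIGHT(LEFT(account, 5), 4) FROM payment;
GO

SET SHOWPLAN_TEXT OFF;
GO
```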

Ron Tuffin asked Sep 30 '10 06:09



1 Answer

+1 for an interesting question. Your assessment that SQL Server may significantly change each statement through optimization is probably accurate, as is your suspicion that on such a large set SQL Server may be able to cache one query better than another.

Two other things come to mind that might be (vaguely) relevant:

  • Memory consumption. I would be curious whether the LEFT/RIGHT combo consumes slightly more memory. In theory, the return value of the first function would need to be stored so that it could be passed into the second function, though the same register might be used over and over.

  • Bounds checking. A varchar is basically a pointer to the beginning of a char[] with 2 extra bytes for indicating length. This suggests that accessing a value by index requires some sort of bounds check against the value in those 2 bytes to make sure the index is not out of range.

SQL Server is also very forgiving of requests outside the limits of the string, for both chars and varchars. The following code will run without any errors.

DECLARE @Test varchar(50);
SET @Test = 'Hello World';
SELECT substring(@Test, 2, 4);
SELECT substring(@Test, 2000, 5000);

So will:

SELECT right(left(@Test, 500), 400);
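A side effect of this forgiving behaviour is that the two expressions from the question only agree when the string has at least 5 characters. The sketch below uses made-up values (@long and @short are illustrative names) to show the difference; the question's guarantee of more than 7 characters means it does not arise there.

```sql
-- Illustrative values only; @short is deliberately shorter than 5 characters.
DECLARE @long  varchar(50);
DECLARE @short varchar(50);
SET @long  = '1234567890';
SET @short = 'abc';

SELECT SUBSTRING(@long, 2, 4)     AS sub_long,   -- '2345'
       RIGHT(LEFT(@long, 5), 4)   AS rl_long,    -- '2345'
       SUBSTRING(@short, 2, 4)    AS sub_short,  -- 'bc'
       RIGHT(LEFT(@short, 5), 4)  AS rl_short;   -- 'abc'
```

For the short value, LEFT returns the whole string and RIGHT then keeps all of it, whereas SUBSTRING still skips the first character.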

My guess is that the answer to your question lies in something related; unfortunately, I don't know what it is.

I would be curious if you got the same performance results using a longer string, or a char versus a varchar. Those tests could yield more insights into the internals of SQL Server.
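A rough way to set up that char-versus-varchar comparison is sketched below. Everything here is an assumption for illustration: the #perf_test table, its column names, the synthetic NEWID() data, and the reliance on sys.all_objects cross-joined with itself to supply enough rows.

```sql
-- Hypothetical scratch table; every name here is illustrative.
CREATE TABLE #perf_test (
    acct_var varchar(50) NOT NULL,
    acct_fix char(50)    NOT NULL
);

-- Up to one million rows of synthetic 36-character values.
-- Assumes sys.all_objects cross-joined with itself yields enough rows.
INSERT INTO #perf_test (acct_var, acct_fix)
SELECT TOP (1000000)
       CONVERT(varchar(50), NEWID()),
       CONVERT(char(50), NEWID())
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

DECLARE @sink char(4);  -- absorbs results so client transfer is not timed

SET STATISTICS TIME ON;

SELECT @sink = SUBSTRING(acct_var, 2, 4)   FROM #perf_test;
SELECT @sink = RIGHT(LEFT(acct_var, 5), 4) FROM #perf_test;
SELECT @sink = SUBSTRING(acct_fix, 2, 4)   FROM #perf_test;
SELECT @sink = RIGHT(LEFT(acct_fix, 5), 4) FROM #perf_test;

SET STATISTICS TIME OFF;

DROP TABLE #perf_test;
```

Widening the columns and the generated values would cover the longer-string case as well.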

Tim M. answered Oct 02 '22 00:10