Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the most efficient way to concatenate a string from all parent rows using T-SQL?

I have a table that has a self-referencing foreign key that represents its parent row. To illustrate the problem in its simplest form we'll use this table:

CREATE TABLE Folder(
    id int IDENTITY(1,1) NOT NULL, --PK
    parent_id int NULL,        --FK
    folder_name varchar(255) NOT NULL)

I want to create a scalar-valued function that would return a concatenated string of the folder's name and all its parent folder names all the way to the root folder, which would be designated by a null parent_id value.

My current solution is a procedural approach which I assume is not ideal. Here is what I'm doing:

CREATE FUNCTION dbo.GetEntireLineage
    (@folderId INT)
    RETURNS VARCHAR(MAX)
AS
BEGIN
    DECLARE @lineage VARCHAR(MAX)
    DECLARE @parentFolderId INT

    SELECT @lineage = folder_name, @parentFolderId = parent_id FROM Folder WHERE id = @folderId

WHILE NOT @parentFolderId IS NULL
    BEGIN
        SET @parentFolderId = (SELECT parent_id FROM Folder WHERE parent_id = @parentFolderId)
        SET @lineage = (SELECT @lineage + '-' + (SELECT folder_name FROM Folder WHERE parent_id = @parentFolderId))
    END
RETURN @lineage
END

Is there a more ideal way to do this? I'm an experienced programmer but T-SQL not a familiar world to me and I know these problems generally require a different approach due to the nature of set based data. Any help finding a solution or any other tips and tricks to deal with T-SQL would be much appreciated.

like image 488
w.brian Avatar asked Jul 14 '11 03:07

w.brian


People also ask

What is the most efficient way to concatenate many strings together?

Concatenate Many Strings using the Join As shown above, using the join() method is more efficient when there are many strings.

How do I concatenate strings from different rows in SQL?

You can concatenate rows into single string using COALESCE method. This COALESCE method can be used in SQL Server version 2008 and higher. All you have to do is, declare a varchar variable and inside the coalesce, concat the variable with comma and the column, then assign the COALESCE to the variable.

What are the 2 methods used for string concatenation?

There are two ways to concatenate strings in Java: By + (String concatenation) operator. By concat() method.

Which is the efficient way of concatenating the string in Java?

StringBuilder is the winner and the fastest way to concatenate Strings. StringBuffer is a close second, because of the synchronized method, and the rest of them are just 1000 times slower than them.


2 Answers

To know for sure about performance you need to test. I have done some testing using your version (slightly modified) and a recursive CTE versions suggested by others.

I used your sample table with 2048 rows all in one single folder hierarchy so when passing 2048 as parameter to the function there are 2048 concatenations done.

The loop version:

create function GetEntireLineage1 (@id int)
returns varchar(max)
as
begin
  declare @ret varchar(max)

  select @ret = folder_name,
         @id = parent_id
  from Folder
  where id = @id

  while @@rowcount > 0
  begin
    select @ret = @ret + '-' + folder_name,
           @id = parent_id
    from Folder
    where id = @id
  end
  return @ret
end

Statistics:

 SQL Server Execution Times:
   CPU time = 125 ms,  elapsed time = 122 ms.

The recursive CTE version:

create function GetEntireLineage2(@id int)
returns varchar(max)
begin
  declare @ret varchar(max);

  with cte(id, name) as
  (
    select f.parent_id,
           cast(f.folder_name as varchar(max))
    from Folder as f
    where f.id = @id
    union all
    select f.parent_id,
           c.name + '-' + f.folder_name
    from Folder as f
      inner join cte as c
        on f.id = c.id
  )
  select @ret = name
  from cte
  where id is null
  option (maxrecursion 0)

  return @ret
end

Statistics:

 SQL Server Execution Times:
   CPU time = 187 ms,  elapsed time = 183 ms.

So between these two it is the loop version that is more efficient, at least on my test data. You need to test on your actual data to be sure.

Edit

Recursive CTE with for xml path('') trick.

create function [dbo].[GetEntireLineage4](@id int)
returns varchar(max)
begin
  declare @ret varchar(max) = '';

  with cte(id, lvl, name) as
  (
    select f.parent_id,
           1,
           f.folder_name
    from Folder as f
    where f.id = @id
    union all
    select f.parent_id,
           lvl + 1,
           f.folder_name
    from Folder as f
      inner join cte as c
        on f.id = c.id
  )
  select @ret = (select '-'+name
                 from cte
                 order by lvl
                 for xml path(''), type).value('.', 'varchar(max)')
  option (maxrecursion 0)

  return stuff(@ret, 1, 1, '')
end

Statistics:

 SQL Server Execution Times:
   CPU time = 31 ms,  elapsed time = 37 ms.
like image 131
Mikael Eriksson Avatar answered Oct 14 '22 12:10

Mikael Eriksson


use a recursive query to traverse the parents and then this method for concatenating into a string.

like image 40
gordy Avatar answered Oct 14 '22 12:10

gordy