I have data that looks something like this example (on an unfortunately much larger scale):
+----+-------+--------------------+-----------------------------------------------+
| ID | Data | Cost | Comments |
+----+-------+--------------------+-----------------------------------------------+
| 1 | 1|2|3 | $0.00|$3.17|$42.42 | test test||previous thing has a blank comment |
+----+-------+--------------------+-----------------------------------------------+
| 2 | 1 | $420.69 | test |
+----+-------+--------------------+-----------------------------------------------+
| 3 | 1|2 | $3.50|$4.20 | |test |
+----+-------+--------------------+-----------------------------------------------+
Some of the columns in the table I have are pipeline delimited, but they are consistent by each row. So each delimited value corresponds to the same index in the other columns of the same row.
So I can do something like this which is what I want for a single column:
SELECT ID, s.value AS datavalue
FROM MyTable t CROSS APPLY STRING_SPLIT(t.Data, '|') s
and that would give me this:
+----+-----------+
| ID | datavalue |
+----+-----------+
| 1 | 1 |
+----+-----------+
| 1 | 2 |
+----+-----------+
| 1 | 3 |
+----+-----------+
| 2 | 1 |
+----+-----------+
| 3 | 1 |
+----+-----------+
| 3 | 2 |
+----+-----------+
but I also want to get the other columns as well (cost and comments in this example) so that the corresponding items are all in the same row like this:
+----+-----------+-----------+------------------------------------+
| ID | datavalue | costvalue | commentvalue |
+----+-----------+-----------+------------------------------------+
| 1 | 1 | $0.00 | test test |
+----+-----------+-----------+------------------------------------+
| 1 | 2 | $3.17 | |
+----+-----------+-----------+------------------------------------+
| 1 | 3 | $42.42 | previous thing has a blank comment |
+----+-----------+-----------+------------------------------------+
| 2 | 1 | $420.69 | test |
+----+-----------+-----------+------------------------------------+
| 3 | 1 | $3.50 | |
+----+-----------+-----------+------------------------------------+
| 3 | 2 | $4.20 | test |
+----+-----------+-----------+------------------------------------+
I'm not sure what the best or most simple way to achieve this would be
This isn't going to be achievable with STRING_SPLIT
as Microsoft refuse to supply the ordinal position as part of the result set*. As a result, you'll need to use a different function which does. Personally, I recommend DelimitedSplit8k_LEAD
, originally written by Jeff Moden and then improved by Eirikur Eiriksson.
Then, you can do this:
CREATE TABLE #Sample (ID int,
[Data] varchar(200),
Cost varchar(200),
Comments varchar(8000));
GO
INSERT INTO #Sample
VALUES (1,'1|2|3','$0.00|$3.17|$42.42','test test||previous thing has a blank comment'),
(2,'1','$420.69','test'),
(3,'1|2','$3.50|$4.20','|test');
GO
SELECT S.ID,
DSd.Item AS DataValue,
DSc.Item AS CostValue,
DSct.Item AS CommentValue
FROM #Sample S
CROSS APPLY dbo.DelimitedSplit8K_LEAD(S.[Data],'|') DSd
CROSS APPLY (SELECT *
FROM DelimitedSplit8K_LEAD(S.Cost,'|') SS
WHERE SS.ItemNumber = DSd.ItemNumber) DSc
CROSS APPLY (SELECT *
FROM DelimitedSplit8K_LEAD(S.Comments,'|') SS
WHERE SS.ItemNumber = DSd.ItemNumber) DSct;
GO
DROP TABLE #Sample;
GO
There is, however, only one true answer to this question: Don't store delimited values in SQL Server. Store them in a normalised manner, and you won't have this problem.
* An ordinal parameter was added to STRING_SPLIT
in SQL Server 2022.
SELECT S.ID,
SSd.Value AS DataValue,
SSc.Value AS CostValue,
SSct.Value AS CommentValue
FROM #Sample S
CROSS APPLY STRING_SPLIT(S.[Data],'|',1) SSd
CROSS APPLY (SELECT *
FROM STRING_SPLIT(S.Cost,'|',1) SS
WHERE SS.ordinal = SSd.ordinal) SSc
CROSS APPLY (SELECT *
FROM STRING_SPLIT(S.Comments,'|',1) SS
WHERE SS.ordinal = SSd.ordinal) SSct;
Here is a solution approach using a recursive CTE instead of a User Defined Funtion (UDF) which is useful for those without permission to create functions.
CREATE TABLE mytable(
ID INTEGER NOT NULL PRIMARY KEY
,Data VARCHAR(7) NOT NULL
,Cost VARCHAR(20) NOT NULL
,Comments VARCHAR(47) NOT NULL
);
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (1,'1|2|3','$0.00|$3.17|$42.42','test test||previous thing has a blank comment');
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (2,'1','$420.69','test');
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (3,'1|2','$3.50|$4.20','|test');
This query allows choice of delimiter by using a variable, then using a common table expression it parses each delimited string to produce a rows for each portion of those strings, and retains the ordinal position of each.
declare @delimiter as varchar(1)
set @delimiter = '|'
;with cte as (
select id
, convert(varchar(max), null) as datavalue
, convert(varchar(max), null) as costvalue
, convert(varchar(max), null) as commentvalue
, convert(varchar(max), data + @delimiter) as data
, convert(varchar(max), cost + @delimiter) as cost
, convert(varchar(max), comments + @delimiter) as comments
from mytable as t
union all
select id
, convert(varchar(max), left(data, charindex(@delimiter, data) - 1))
, convert(varchar(max), left(cost, charindex(@delimiter, cost) - 1))
, convert(varchar(max), left(comments, charindex(@delimiter, comments) - 1))
, convert(varchar(max), stuff(data, 1, charindex(@delimiter, data), ''))
, convert(varchar(max), stuff(cost, 1, charindex(@delimiter, cost), ''))
, convert(varchar(max), stuff(comments, 1, charindex(@delimiter, comments), ''))
from cte
where (data like ('%' + @delimiter + '%') and cost like ('%' + @delimiter + '%')) or comments like ('%' + @delimiter + '%')
)
select id, datavalue, costvalue, commentvalue
from cte
where datavalue IS NOT NULL
order by id, datavalue
As the recursion adds new rows, it places the first portion of the delimited strings into the wanted output columns using left()
, then also, using stuff()
, removes the last used delimiter from the source strings so that the next row will start at the next delimiter. Note that to initiate the extractions, the delimiter is added to the end of the source delimited strings which is to ensure the where clause does not exclude any of the wanted strings.
the result:
id datavalue costvalue commentvalue
---- ----------- ----------- ------------------------------------
1 1 $0.00 test test
1 2 $3.17
1 3 $42.42 previous thing has a blank comment
2 1 $420.69 test
3 1 $3.50
3 2 $4.20 test
demonstrated here at dbfiddle.uk
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With