Why is my delimiter not appearing in the final output? It's initialized to be a comma, but I only get ~5 white spaces between each attribute using:
SELECT [article_id]
, dbo.GROUP_CONCAT(0, t.tag_name, ',') AS col
FROM [AdventureWorks].[dbo].[ARTICLE_TAG_XREF] atx
JOIN [AdventureWorks].[dbo].[TAGS] t ON t.tag_id = atx.tag_id
GROUP BY article_id
The bit for DISTINCT works fine, but it operates within the Accumulate scope...
Output:
article_id | col
-------------------------------------------------
1 | a a b c
Update: The excess space between values is because the column as defined as NCHAR(10), so 10 characters would appear in the output. Silly mistake on my part...
With Martin Smith's help about working with the Write(BinaryWriter w)
method, this update works for me:
public void Write(BinaryWriter w)
{
w.Write(list.Count);
for (int i = 0; i < list.Count; i++ )
{
if (i < list.Count - 1)
{
w.Write(list[i].ToString() + delimiter);
}
else
{
w.Write(list[i].ToString());
}
}
}
Why does the above solve my problem? And why wouldn't it let me use more than one w.write
call inside the FOR loop?
using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Xml.Serialization;
using System.Xml;
using System.IO;
using System.Collections;
using System.Text;
[Serializable]
[SqlUserDefinedAggregate(Format.UserDefined, MaxByteSize = 8000)]
public struct GROUP_CONCAT : IBinarySerialize
{
ArrayList list;
string delimiter;
public void Init()
{
list = new ArrayList();
delimiter = ",";
}
public void Accumulate(SqlBoolean isDistinct, SqlString Value, SqlString separator)
{
delimiter = (separator.IsNull) ? "," : separator.Value ;
if (!Value.IsNull)
{
if (isDistinct)
{
if (!list.Contains(Value.Value))
{
list.Add(Value.Value);
}
}
else
{
list.Add(Value.Value);
}
}
}
public void Merge(GROUP_CONCAT Group)
{
list.AddRange(Group.list);
}
public SqlString Terminate()
{
string[] strings = new string[list.Count];
for (int i = 0; i < list.Count; i++)
{
strings[i] = list[i].ToString();
}
return new SqlString(string.Join(delimiter, strings));
}
#region IBinarySerialize Members
public void Read(BinaryReader r)
{
int itemCount = r.ReadInt32();
list = new ArrayList(itemCount);
for (int i = 0; i < itemCount; i++)
{
this.list.Add(r.ReadString());
}
}
public void Write(BinaryWriter w)
{
w.Write(list.Count);
foreach (string s in list)
{
w.Write(s);
}
}
#endregion
}
The problem here is that you do not serialize delimiter. Add:
w.Write(delimiter)
as a first line in your Write method and
delimiter = r.ReadString();
as a first line in your Read method.
Regarding your questions to suggested work-around:
Why does the above solve my problem?
It does not. It merely worked with your test scenario.
And why wouldn't it let me use more than one w.write call inside the FOR loop?
Write method needs to be compatible with Read method. If you write two strings and read only one then it is not going to work. The idea here is that your object may be removed from the memory and then loaded. This is what Write and Read are supposed to do. In your case - this indeed was happening and you were not able to keep the object value.
The answer given by @agsamek is correct but not complete. The query processor may instantiate multiple aggregators, e.g. for parallel computations, and the one that will eventually hold all data after successive calls of Merge()
may be assigned an empty recordset, i.e. its Accumulate()
method may be never called:
var concat1 = new GROUP_CONCAT();
concat1.Init();
results = getPartialResults(1); // no records returned
foreach (var result in results)
concat1.Accumulate(result[0], delimiter); // never called
...
var concat2 = new GROUP_CONCAT();
concat2.Init();
results = getPartialResults(2);
foreach (var result in results)
concat2.Accumulate(result[0], delimiter);
...
concat1.Merge(concat2);
...
result = concat1.Terminate();
In this scenario, concat1
's private field delimiter
used in Terminate()
remains what it is by default in Init()
but not what you pass in SQL. Luckily or not, your test SQL uses the same delimiter value as in Init()
, so you can't reveal the difference.
I'm not sure if this is a bug or if it has been fixed in later versions (I stumbled on it in SQL Server 2008 R2). My workaround was to make use of the other group that is passed in Merge()
:
public void Merge(GROUP_CONCAT Group)
{
if (Group.list.Count != 0) // Group's Accumulate() has been called at least once
{
if (list.Count == 0) // this Accumulate() has not been called
delimiter = Group.delimiter;
list.AddRange(Group.list);
}
}
P.S. I would use StringBuilder
instead of ArrayList
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With