
What's the simplest way to encode a List<String> into a plain String and decode it back?

I think I've come across this requirement a dozen times, but I could never find a satisfying solution. For instance, I have a collection of strings that I want to serialize (to disk or over the network) through a channel where only a plain string is allowed. I almost always end up using "split" and "join" with a ridiculous separator like ":::==--==:::", like this:

public static string encode(System.Collections.Generic.List<string> data)
{
    return string.Join(" :::==--==::: ", data.ToArray());
}
public static string[] decode(string encoded)
{
    return encoded.Split(new string[] { " :::==--==::: " }, StringSplitOptions.None);
}

But this simple solution has some obvious flaws. None of the strings may contain the separator string, and consequently an encoded string cannot itself be encoded again as a list item.

AFAIK, a comprehensive solution would involve escaping the separator on encoding and unescaping it on decoding. While the problem sounds simple, I believe a complete solution can take a significant amount of code. Is there any trick that would allow me to build the encoder and decoder in very few lines of code?
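
As a rough illustration of the escaping idea described above, here is a minimal sketch. The class name EscapedListCodec, the comma separator, and the backslash escape character are all assumptions chosen for the example, not part of the original code. Each item is followed by a terminating separator (rather than joined), so an empty list and a list holding one empty string stay distinguishable.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: escapes the separator instead of hoping it never occurs.
public static class EscapedListCodec
{
    private const char Separator = ','; // assumed separator
    private const char Escape = '\\';   // assumed escape character

    public static string Encode(IEnumerable<string> items)
    {
        var sb = new StringBuilder();
        foreach (var item in items)
        {
            foreach (var c in item)
            {
                // Prefix every separator or escape character with the escape character.
                if (c == Separator || c == Escape) sb.Append(Escape);
                sb.Append(c);
            }
            sb.Append(Separator); // unescaped separator terminates the item
        }
        return sb.ToString();
    }

    public static List<string> Decode(string encoded)
    {
        var items = new List<string>();
        var current = new StringBuilder();
        for (int i = 0; i < encoded.Length; i++)
        {
            char c = encoded[i];
            if (c == Escape && i + 1 < encoded.Length)
            {
                current.Append(encoded[++i]); // take the escaped character literally
            }
            else if (c == Separator)
            {
                items.Add(current.ToString()); // unescaped separator ends the item
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }
        // Any trailing text without a terminator is ignored in this sketch.
        return items;
    }
}
```

Because the output of Encode is an ordinary string, it can be passed back in as a list item and encoded again, which the fixed-separator version cannot do. Null items are not handled here.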

Sake asked May 12 '09 15:05


5 Answers

Add a reference to System.Web (plus using directives for System.Web and System.Linq), and then:

public static string Encode(IEnumerable<string> strings)
{
    return string.Join("&", strings.Select(s => HttpUtility.UrlEncode(s)).ToArray());
}

public static IEnumerable<string> Decode(string list)
{
    return list.Split('&').Select(s => HttpUtility.UrlDecode(s));
}

Most languages have a pair of utility functions that do URL "percent" encoding, and these are ideal for reuse in this kind of situation.
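
If referencing System.Web is inconvenient, the same percent-encoding idea can be sketched with Uri.EscapeDataString and Uri.UnescapeDataString from the core System namespace. This is a variant of the code above rather than the original answer's code, and the class name PercentListCodec is made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of the URL-encoding approach without System.Web.
public static class PercentListCodec
{
    public static string Encode(IEnumerable<string> strings)
    {
        // EscapeDataString percent-encodes '&', so '&' is safe as the separator.
        return string.Join("&", strings.Select(s => Uri.EscapeDataString(s)));
    }

    public static List<string> Decode(string encoded)
    {
        return encoded.Split('&')
                      .Select(s => Uri.UnescapeDataString(s))
                      .ToList();
    }
}
```

One caveat that applies to this sketch: an empty list and a list containing a single empty string both encode to the empty string, and decoding the empty string yields one empty item.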

Daniel Earwicker answered Nov 04 '22 05:11


You could call the .ToArray() method on the List<> and then serialize the array. The result could then be written to disk or sent over the network, and reconstituted by deserializing it on the other end.

It's not much code, and you get to use the serialization techniques already coded and tested in the .NET Framework.
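
The answer does not name a particular serializer, so the following is only one possible sketch. It assumes System.Text.Json (available in .NET Core 3.0 and later, or via the NuGet package), which is a choice made for this example, and the class name JsonListCodec is hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical wrapper: lets a stock serializer do all the escaping.
public static class JsonListCodec
{
    public static string Encode(List<string> data)
    {
        // Produces a JSON array such as ["abc","defg"].
        return JsonSerializer.Serialize(data.ToArray());
    }

    public static string[] Decode(string encoded)
    {
        // Deserialize returns null for the JSON literal "null"; map that to an empty array.
        return JsonSerializer.Deserialize<string[]>(encoded) ?? new string[0];
    }
}
```

Since the result is plain text, it fits the "only a plain string is allowed" constraint from the question, and quotes, separators, and line breaks inside the items are escaped by the serializer.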

Mike answered Nov 04 '22 05:11


You might like to look at the way CSV files are formatted.

  • escape all instances of the text delimiter, e.g. ", within each string
  • wrap each item in the list in delimiters: "item"
  • join using a simple separator like ,

I don't believe there is a silver bullet solution to this problem.
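
The three steps above can be sketched roughly as follows. This assumes the common CSV convention of escaping a quote by doubling it, which the answer does not spell out, and the class name QuotedListCodec is invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical CSV-style codec: quote every item, double any inner quotes.
public static class QuotedListCodec
{
    public static string Encode(IEnumerable<string> items)
    {
        return string.Join(",",
            items.Select(s => "\"" + s.Replace("\"", "\"\"") + "\""));
    }

    public static List<string> Decode(string encoded)
    {
        var items = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;
        for (int i = 0; i < encoded.Length; i++)
        {
            char c = encoded[i];
            if (!inQuotes)
            {
                // Outside quotes only an opening quote matters; commas are skipped.
                if (c == '"') { inQuotes = true; current.Clear(); }
            }
            else if (c == '"')
            {
                if (i + 1 < encoded.Length && encoded[i + 1] == '"')
                {
                    current.Append('"'); // doubled quote means a literal quote
                    i++;
                }
                else
                {
                    inQuotes = false;    // closing quote ends the item
                    items.Add(current.ToString());
                }
            }
            else
            {
                current.Append(c);
            }
        }
        return items;
    }
}
```

The decoder is deliberately lenient: it does not validate malformed input, so treat it as an illustration of the format rather than a full CSV parser.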

Adam Pope answered Nov 04 '22 05:11


Here's an old-school technique that might be suitable -

Serialise by storing the length of each string as a fixed-width prefix in front of it.

So

 string[0]="abc"
 string[1]="defg"
 string[2]=" :::==--==::: "

becomes

0003abc0004defg0014 :::==--==::: 

...where the prefix is wide enough to hold the maximum string length.
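
A minimal sketch of this length-prefix scheme, using the 4-digit prefix from the example above. The class name LengthPrefixCodec is hypothetical, and rejecting items longer than 9999 characters is an assumption made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical codec: each item is written as a 4-digit length followed by its text.
public static class LengthPrefixCodec
{
    private const int PrefixWidth = 4;  // digits in the length prefix
    private const int MaxLength = 9999; // largest length that fits in 4 digits

    public static string Encode(IEnumerable<string> items)
    {
        var sb = new StringBuilder();
        foreach (var item in items)
        {
            if (item.Length > MaxLength)
                throw new ArgumentException("Item too long for a 4-digit prefix.");
            // "D4" pads the length with leading zeros, e.g. 3 becomes 0003.
            sb.Append(item.Length.ToString("D4", CultureInfo.InvariantCulture));
            sb.Append(item);
        }
        return sb.ToString();
    }

    public static List<string> Decode(string encoded)
    {
        var items = new List<string>();
        int pos = 0;
        while (pos < encoded.Length)
        {
            // Read the fixed-width length, then exactly that many characters.
            int length = int.Parse(encoded.Substring(pos, PrefixWidth),
                                   CultureInfo.InvariantCulture);
            pos += PrefixWidth;
            items.Add(encoded.Substring(pos, length));
            pos += length;
        }
        return items;
    }
}
```

No escaping is needed because the decoder never searches for a separator; it only counts characters. Lengths here are measured in UTF-16 code units (string.Length), and malformed input will throw.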

Ed Guiness answered Nov 04 '22 05:11


You could use an XmlDocument to handle the serialization. That will handle the encoding for you.

public static string encode(System.Collections.Generic.List<string> data)
{
    var xml = new XmlDocument();
    xml.AppendChild(xml.CreateElement("data"));
    foreach (var item in data)
    {
        var xmlItem = (XmlElement)xml.DocumentElement.AppendChild(xml.CreateElement("item"));
        xmlItem.InnerText = item;
    }
    return xml.OuterXml;
}

public static string[] decode(string encoded)
{
    var items = new System.Collections.Generic.List<string>();
    var xml = new XmlDocument();
    xml.LoadXml(encoded);
    foreach (XmlElement xmlItem in xml.SelectNodes("/data/item"))
        items.Add(xmlItem.InnerText);
    return items.ToArray();
}

David answered Nov 04 '22 04:11