
String Parsing in C#

What is the most efficient way to parse a C# string in the form of

"(params (abc 1.3)(sdc 2.0)(www 3.05)....)"

into a struct in the form

struct Params
{
  double abc,sdc,www....;
}

Thanks

EDIT: The structure always has the same parameters (same names, only doubles, known at compile time), but the order is not guaranteed. Only one struct is parsed at a time.

Betamoo asked May 03 '10 18:05


2 Answers

using System;

namespace ConsoleApplication1
{
    class Program
    {
        struct Params
        {
            public double abc, sdc;
        };

        static void Main(string[] args)
        {
            string s = "(params (abc 1.3)(sdc 2.0))";
            Params p = new Params();
            object pbox = (object)p; // structs must be boxed for SetValue() to work

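            // Drop the leading "(params " (8 chars) and every ")", then split on ' ' and '('
            // to get alternating name/value tokens: abc, 1.3, sdc, 2.0
            string[] arr = s.Substring(8).Replace(")", "").Split(new char[] { ' ', '(' }, StringSplitOptions.RemoveEmptyEntries);
            for (int i = 0; i + 1 < arr.Length; i += 2)
                // Invariant culture so "1.3" parses the same regardless of the machine's locale
                typeof(Params).GetField(arr[i]).SetValue(pbox, double.Parse(arr[i + 1], System.Globalization.CultureInfo.InvariantCulture));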
            p = (Params)pbox;
            Console.WriteLine("p.abc={0} p.sdc={1}", p.abc, p.sdc);
        }
    }
}

Note: if you used a class instead of a struct the boxing/unboxing would not be necessary.
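As a rough sketch of that note, here is the same tokenizing applied to a class, so no boxing step is needed. The `Parse` helper, the null check on the field lookup, and the use of `CultureInfo.InvariantCulture` are additions for illustration and are not part of the code above.

```csharp
using System;
using System.Globalization;

class Params
{
    public double abc, sdc;
}

static class Program
{
    // Hypothetical helper: same tokenizing as above, but fills a class instance.
    public static Params Parse(string s)
    {
        var p = new Params(); // reference type: SetValue mutates this instance directly
        string[] arr = s.Substring(8).Replace(")", "")
            .Split(new[] { ' ', '(' }, StringSplitOptions.RemoveEmptyEntries);
        for (int i = 0; i + 1 < arr.Length; i += 2)
        {
            var field = typeof(Params).GetField(arr[i]);
            if (field == null)
                throw new FormatException("Unknown parameter: " + arr[i]);
            // Invariant culture (an assumption) so "1.3" parses the same on any locale.
            field.SetValue(p, double.Parse(arr[i + 1], CultureInfo.InvariantCulture));
        }
        return p;
    }

    static void Main()
    {
        // Order of the pairs does not matter, since fields are looked up by name.
        Params p = Parse("(params (sdc 2.0)(abc 1.3))");
        Console.WriteLine("p.abc={0} p.sdc={1}", p.abc, p.sdc);
    }
}
```

Reflection is still used for every field, so this is convenient rather than fast. If the field names are fixed at compile time, a plain `switch` on the name avoids the lookup cost.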

Simon Chadwick answered Oct 05 '22 05:10


Depending on your complete grammar, you have a few options. If it's a very simple grammar and you don't have to check the input for errors, you could simply go with the code below (which will be fast):

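// The elided "...." from the question is left out here so the sample parses.
var input = "(params (abc 1.3)(sdc 2.0)(www 3.05))";
// Split('(') yields "", "params ", "abc 1.3)", ... so the type name is tokens[1].
var tokens = input.Split('(');
var typeName = tokens[1].Trim();
//you'll need more than the type name (assembly/namespace) so I'll leave that to you
Type t = getStructFromType(typeName);
// CreateInstance returns a boxed object, so SetValue also works when t is a struct.
object obj = TypeDescriptor.CreateInstance(null, t, null, null);
for (var i = 2; i < tokens.Length; i++)
{
    var innerTokens = tokens[i].Trim(' ', ')').Split(' ');
    var fieldName = innerTokens[0];
    // Invariant culture so "1.3" parses the same regardless of locale.
    var value = Convert.ToDouble(innerTokens[1], System.Globalization.CultureInfo.InvariantCulture);
    var field = t.GetField(fieldName);
    field.SetValue(obj, value);
}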

That simple approach, however, requires a well-formed string, or it will misbehave.

If the grammar is a bit more complicated, e.g. nested ( ), that simple approach won't work.

You could try a regex, but that still requires a rather simple grammar, so if you end up with a complex grammar, your best choice is a real parser. Irony is easy to use since you can write the grammar entirely in plain C# (some knowledge of BNF is a plus, though).
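For the flat grammar in the question, the regex route might look like the sketch below. It assumes values are plain decimals (no exponent notation) and that the field names `abc`, `sdc`, and `www` are fixed at compile time, as the question's edit states. `ParamsParser`, `Pair`, and `Demo` are hypothetical names, and a `switch` replaces reflection here.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

struct Params
{
    public double abc, sdc, www;
}

static class ParamsParser
{
    // Matches one "(name value)" pair. The outer "(params" is followed by "(",
    // not a number, so it never matches and is skipped.
    static readonly Regex Pair = new Regex(
        @"\(\s*(?<name>\w+)\s+(?<value>[-+]?\d+(?:\.\d+)?)\s*\)",
        RegexOptions.Compiled);

    public static Params Parse(string input)
    {
        var p = new Params();
        foreach (Match m in Pair.Matches(input))
        {
            double v = double.Parse(m.Groups["value"].Value, CultureInfo.InvariantCulture);
            switch (m.Groups["name"].Value)
            {
                case "abc": p.abc = v; break;
                case "sdc": p.sdc = v; break;
                case "www": p.www = v; break;
                default:
                    throw new FormatException("Unknown parameter: " + m.Groups["name"].Value);
            }
        }
        return p;
    }
}

static class Demo
{
    static void Main()
    {
        // Pairs may arrive in any order.
        Params p = ParamsParser.Parse("(params (www 3.05)(abc 1.3)(sdc 2.0))");
        Console.WriteLine("abc={0} sdc={1} www={2}", p.abc, p.sdc, p.www);
    }
}
```

Text that does not match the pattern is silently ignored, so malformed pairs are dropped rather than reported. That is the same trade-off as the split-based version: fine for trusted input, not a substitute for a real parser.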

Rune FS answered Oct 05 '22 03:10