
Parsing strings recursively

Tags: c#, regex

I am trying to extract information out of a string, specifically a Fortran format string. The string looks like this:

F8.3, I5, 3(5X, 2(A20,F10.3)), 'XXX'

Formatting fields are delimited by ",", and formatting groups are enclosed in parentheses. The number in front of an opening parenthesis indicates how many consecutive times the group's pattern is repeated. So the string above expands to:

F8.3, I5, 5X, A20,F10.3, A20,F10.3, 5X, A20,F10.3, A20,F10.3, 5X, A20,F10.3, A20,F10.3, 'XXX'

I am trying to write something in C# that will expand a string conforming to that pattern. I have started going about it with lots of switch and if statements, but I wonder whether I am going about it the wrong way.

I was basically wondering whether some regex wizard thinks regular expressions can do this in one fell swoop. I know nothing about regular expressions, but if they could solve my problem I would consider putting in some time to learn how to use them. On the other hand, if regular expressions can't sort this out, I'd rather spend my time looking at another method.

yu_ominae asked Nov 13 '22 10:11


1 Answer

This has to be doable with Regex :) I've expanded my previous example and it tests nicely with your example.

// Requires: using System; using System.Linq; using System.Text.RegularExpressions;

// Regex to match the innermost patterns of n(X) and capture the values of n and X.
private static readonly Regex matcher = new Regex(@"(\d+)\(([^(]*?)\)", RegexOptions.None);

// Create a new string by repeating X n times, separated with ','.
private static string Join(Match m)
{
    var n = Convert.ToInt32(m.Groups[1].Value); // get value of n
    var x = m.Groups[2].Value; // get value of X
    return String.Join(",", Enumerable.Repeat(x, n));
}

// Expand the string by recursively replacing the innermost values of n(X).
private static string Expand(string text)
{
    var s = matcher.Replace(text, Join);
    return (matcher.IsMatch(s)) ? Expand(s) : s;
}

// Parse a string for occurrences of the n(X) pattern and expand them.
// Return the string as a tokenized array.
// Note: the final Split also breaks quoted literals that contain ',' or ' '.
public static string[] Parse(string text)
{
    // Check that the parentheses are balanced; an even count alone would accept ")(".
    var depth = 0;
    foreach (var c in text)
    {
        if (c == '(') depth++;
        else if (c == ')' && --depth < 0) break;
    }
    if (depth != 0)
        throw new ArgumentException("The string contains unbalanced parentheses.");

    return Expand(text).Split(new[] { ',', ' ' }, StringSplitOptions.RemoveEmptyEntries);
}
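
For comparison, the same expansion can also be written without regular expressions, as a small recursive-descent parser. What follows is only a sketch and not part of the code above: the class name FormatExpander and its methods are hypothetical, it assumes literals are wrapped in single quotes (with '' as an escaped quote), and it assumes a group without a leading count is repeated once. Because it walks the string character by character, commas, spaces and parentheses inside a quoted literal are left alone.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; the name is not taken from the answer above.
public static class FormatExpander
{
    // Expands repeat groups such as 3(5X, 2(A20,F10.3)) into a flat list of fields.
    public static List<string> Expand(string format)
    {
        int pos = 0;
        var items = ParseList(format, ref pos);
        // ParseList only stops early on a ')' that no group is waiting for.
        if (pos < format.Length)
            throw new FormatException("Unexpected ')' at position " + pos + ".");
        return items;
    }

    // Parses comma-separated items until the end of input or an unconsumed ')'.
    private static List<string> ParseList(string s, ref int pos)
    {
        var result = new List<string>();
        while (pos < s.Length && s[pos] != ')')
        {
            char c = s[pos];
            if (c == ',' || char.IsWhiteSpace(c)) { pos++; continue; }

            // Leading digits are either a repeat count (3(...)) or part of a field (5X).
            int start = pos;
            while (pos < s.Length && char.IsDigit(s[pos])) pos++;

            if (pos < s.Length && s[pos] == '(')
            {
                // Assumption: a group with no count is repeated once.
                int count = pos > start ? int.Parse(s.Substring(start, pos - start)) : 1;
                pos++; // skip '('
                var inner = ParseList(s, ref pos); // recurse into the group
                if (pos >= s.Length) throw new FormatException("Missing ')'.");
                pos++; // skip ')'
                for (int i = 0; i < count; i++) result.AddRange(inner);
            }
            else
            {
                pos = start; // the digits belong to the field itself
                result.Add(ReadField(s, ref pos));
            }
        }
        return result;
    }

    // Reads one field up to the next ',', '(' or ')' that is outside single quotes.
    private static string ReadField(string s, ref int pos)
    {
        int start = pos;
        bool inQuote = false;
        while (pos < s.Length)
        {
            char c = s[pos];
            if (c == '\'') inQuote = !inQuote;
            else if (!inQuote && (c == ',' || c == '(' || c == ')')) break;
            pos++;
        }
        return s.Substring(start, pos - start).Trim();
    }
}
```

Calling FormatExpander.Expand("F8.3, I5, 3(5X, 2(A20,F10.3)), 'XXX'") returns 18 fields: F8.3, I5, then 5X, A20, F10.3, A20, F10.3 three times over, and finally 'XXX'. Whether this or the regex version is the better fit depends mostly on whether the format strings contain quoted literals with delimiters in them.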

Huusom answered Nov 16 '22 02:11