In a program I'm reading in some data files, part of which are formatted as a series of records each in square brackets. Each record contains a section title and a series of key/value pairs.
I originally wrote code to loop through and extract the values, but decided it could be done more elegantly using regular expressions. Below is my resulting code (I just hacked it out for now in a console app - so know the variable names aren't that great, etc.
Can you suggest improvements? I feel it shouldn't be necessary to do two matches and a substring, but can't figure out how to do it all in one big step:
string input = "[section1 key1=value1 key2=value2][section2 key1=value1 key2=value2 key3=value3][section3 key1=value1]";
MatchCollection matches=Regex.Matches(input, @"\[[^\]]*\]");
foreach (Match match in matches)
{
string subinput = match.Value;
int firstSpace = subinput.IndexOf(' ');
string section = subinput.Substring(1, firstSpace-1);
Console.WriteLine(section);
MatchCollection newMatches = Regex.Matches(subinput.Substring(firstSpace + 1), @"\s*(\w+)\s*=\s*(\w+)\s*");
foreach (Match newMatch in newMatches)
{
Console.WriteLine("{0}={1}", newMatch.Groups[1].Value, newMatch.Groups[2].Value);
}
}
Being a middle-level language, C reduces the gap between the low-level and high-level languages. It can be used for writing operating systems as well as doing application level programming. Helps to understand the fundamentals of Computer Theories.
C and C++ are now considered low-level languages because they have no automatic memory management. Olivier: The definition of low level has changed quite a bit since the inception of computer science. I would not qualify C as a low or high level language, but rather more like an intermediary language.
C is a powerful general-purpose programming language. It can be used to develop software like operating systems, databases, compilers, and so on.
I prefer named captures, nice formatting, and clarity:
string input = "[section1 key1=value1 key2=value2][section2 key1=value1 key2=value2 key3=value3][section3 key1=value1]";
MatchCollection matches = Regex.Matches(input, @"\[
(?<sectionName>\S+)
(\s+
(?<key>[^=]+)
=
(?<value>[^ \] ]+)
)+
]", RegexOptions.IgnorePatternWhitespace);
foreach(Match currentMatch in matches)
{
Console.WriteLine("Section: {0}", currentMatch.Groups["sectionName"].Value);
CaptureCollection keys = currentMatch.Groups["key"].Captures;
CaptureCollection values = currentMatch.Groups["value"].Captures;
for(int i = 0; i < keys.Count; i++)
{
Console.WriteLine("{0}={1}", keys[i].Value, values[i].Value);
}
}
You should take advantage of the collections to get each key. So something like this then:
string input = "[section1 key1=value1 key2=value2][section2 key1=value1 key2=value2 key3=value3][section3 key1=value1]";
Regex r = new Regex(@"(\[(\S+) (\s*\w+\s*=\s*\w+\s*)*\])", RegexOptions.Compiled);
foreach (Match m in r.Matches(input))
{
Console.WriteLine(m.Groups[2].Value);
foreach (Capture c in m.Groups[3].Captures)
{
Console.WriteLine(c.Value);
}
}
Resulting output:
section1
key1=value1
key2=value2
section2
key1=value1
key2=value2
key3=value3
section3
key1=value1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With