I had an interesting interview question the other day, which I really struggled with. The (highly ambitious) spec required me to write, in C#, parsers for two different data streams. Here is a made-up example of the first stream:
30=EUR/USD,35=3,50=ON,51=12.5,52=13.5,50=6M,51=15.4,52=16.2,50=1Y,51=17.2,52=18.3
where 30 is the currency pair, 35 is the number of tenors, and 50, 51 and 52 are the tenor, bid and ask respectively. The bid and ask are optional, but a valid tenor-bid-ask tuple will have at least one of the two prices. The framework code they supplied implied that parsing this line should produce 3 individual objects (DataElement instances). I ended up with a rather nasty implementation built on a switch statement and a loop, and I am not sure it actually worked.
What techniques are there for reading this kind of stream? I tried to figure out something with recursion, which I couldn't get right.
EDIT: Based on @evanmcdonnall's answer (accepted) here is the fully compiling and working code, in case it's useful for anyone else.
List<DataElement> Parse(string row)
{
    string currency = string.Empty;
    DataElement[] elements = null;
    int j = 0;
    bool start = false;
    string[] tokens = row.Split(',');
    for (int i = 0; i < tokens.Length; i++)
    {
        string[] kv = tokens[i].Split('=');
        switch (kv[0])
        {
            case "30":
                currency = kv[1];
                break;
            case "35":
                elements = new DataElement[int.Parse(kv[1])];
                break;
            case "50":
                if (start)
                    j++;
                elements[j] = new DataElement() { currency = currency, tenor = kv[1] };
                start = true;
                break;
            case "51":
                elements[j].bid = double.Parse(kv[1]);
                break;
            case "52":
                elements[j].ask = double.Parse(kv[1]);
                break;
        }
    }
    return elements.ToList();
}
I don't see what's so tricky about it. However, I also don't see any solution that would be better than the very specific one I have in mind: a single loop with many conditionals.
First you split on commas, then you loop over those tokens, splitting each on the equals sign to get your key-value pair. You have a check for each key and a bool to track when you start or finish an item. You read the currency and use it for each object. You read key 35 and find there are 3 objects, so you allocate an array of three objects, each with 3 properties: tenor, bid, and ask. When you encounter 50 you set your start flag to true. You set 50, 51, and 52 if they're there. Below is some sample code:
string currency = string.Empty;
DataElement[] elements = null; // declared outside the loop so every branch can see it
int j = 0;
bool start = false;
string[] tokens = line.Split(',');
for (int i = 0; i < tokens.Length; i++)
{
    string[] kv = tokens[i].Split('=');
    if (kv[0] == "30")
        currency = kv[1];
    else if (kv[0] == "35")
    {
        // key 35 carries the number of tenors, so size the array from it
        elements = new DataElement[int.Parse(kv[1])];
    }
    else if (kv[0] == "50")
    {
        if (start)
            j++;
        start = true; // flip your flag after the condition so it works for element 0
        elements[j] = new DataElement(); // array slots start out null
        elements[j].currency = currency;
        elements[j].tenor = kv[1];
    }
    else if (kv[0] == "51")
        elements[j].bid = double.Parse(kv[1]);
    else if (kv[0] == "52")
        elements[j].ask = double.Parse(kv[1]);
    // if these optional values aren't there we'll just fall back into the case for 50
    // and everything will work as expected.
}
The code may not be pretty, but the logic is fairly trivial and, assuming the line's format is correct, it will always work.
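If you can't assume the format is correct, the same loop can carry a few checks. What follows is only a rough sketch, not the framework's API: `Quote` and `TenorParser` are names I made up (`Quote` stands in for `DataElement`), the prices are nullable so a missing bid or ask can be told apart from 0, and I'm assuming the numbers use `.` as the decimal separator. It rejects a price that appears before any tenor, a tenor count that doesn't match key 35, and a tenor with neither price.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical stand-in for the framework's DataElement.
public class Quote
{
    public string Currency;
    public string Tenor;
    public double? Bid;   // null means the bid was not in the stream
    public double? Ask;   // null means the ask was not in the stream
}

public static class TenorParser
{
    public static List<Quote> Parse(string row)
    {
        string currency = null;
        int expected = -1;        // value of key 35; stays -1 if never seen
        var quotes = new List<Quote>();
        Quote current = null;     // the tuple currently being filled in

        foreach (string token in row.Split(','))
        {
            string[] kv = token.Split('=');
            if (kv.Length != 2)
                throw new FormatException("Malformed token: " + token);

            switch (kv[0])
            {
                case "30":
                    currency = kv[1];
                    break;
                case "35":
                    expected = int.Parse(kv[1], CultureInfo.InvariantCulture);
                    break;
                case "50":
                    // A tenor key always opens a new tuple.
                    current = new Quote { Currency = currency, Tenor = kv[1] };
                    quotes.Add(current);
                    break;
                case "51":
                case "52":
                    if (current == null)
                        throw new FormatException("Price before any tenor: " + token);
                    double price = double.Parse(kv[1], CultureInfo.InvariantCulture);
                    if (kv[0] == "51") current.Bid = price; else current.Ask = price;
                    break;
                // Any other key is silently skipped.
            }
        }

        if (quotes.Count != expected)
            throw new FormatException("Key 35 announced " + expected + " tenors, found " + quotes.Count);
        foreach (Quote q in quotes)
            if (q.Bid == null && q.Ask == null)
                throw new FormatException("Tenor " + q.Tenor + " has neither bid nor ask");
        return quotes;
    }
}
```

Calling `TenorParser.Parse` on the sample line from the question should give three `Quote` objects. Using a list instead of a pre-sized array also means a wrong value for key 35 shows up as a `FormatException` rather than an index error.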