I am writing a program in C# which essentially reads an SVG file, and does some useful things with the contents. The most complex data I will be working with are paths. They take forms such as this:
<path d="M5.4,3.806h6.336v43.276h20.738v5.256H5.4V3.806z"/>
In this case, the M, h, v, H, V, and z indicate some commands. In a way they are like functions, with the numbers following them being arguments. There are also some more complex ones:
<path d="M70.491,50.826c-2.232,1.152-6.913,2.304-12.817,2.304c-13.682,0-23.906-8.641-23.906-24.626
c0-15.266,10.297-25.49,25.346-25.49c5.977,0,9.865,1.296,11.521,2.16l-1.584,5.112C66.747,9.134,63.363,8.27,59.33,8.27
c-11.377,0-18.938,7.272-18.938,20.018c0,11.953,6.841,19.514,18.578,19.514c3.888,0,7.777-0.792,10.297-2.016L70.491,50.826z"/>
In this case, the "c" command is followed by 6 arguments (-2.232, 1.152, -6.913, 2.304, -12.817, and 2.304 in the first case). You can see how this can get tricky. My question is this: is the SO community aware of any existing libraries that read such data into some useful ADTs?
Before I go off coding everything and writing a ton of string parsing functions, I'd really like to not re-invent the wheel. Also, any advice would be appreciated. I am aware of how to read an XML document, that isn't the issue here.
I don't know of any specific libraries in C#; however, you could start by parsing this kind of structure like this:
using System.Linq;
using System.Text.RegularExpressions;

string path = "M5.4,3.806h6.336v43.276h20.738v5.256H5.4V3.806z";
// These letters are the valid SVG path commands. Whenever we find one,
// a new command is starting, so we split the string there. The lookahead
// keeps the letter at the start of each token.
string separators = @"(?=[MZLHVCSQTAmzlhvcsqta])";
var tokens = Regex.Split(path, separators).Where(t => !string.IsNullOrEmpty(t));
Now you have a list of commands, each followed by its arguments. You could then proceed to split the arguments in the same way.
You said the arguments can be separated by a space, a comma or a minus sign (which, unlike the comma and the whitespace, should remain part of the argument), so you can use another simple regex (note that I'm no fan of regular expressions, but in this case I think they add to readability).
string argSeparators = @"[\s,]|(?=-)"; // discard whitespace and comma but keep the -
var splitArgs = Regex
.Split(remainingargs, argSeparators)
.Where(t => !string.IsNullOrEmpty(t));
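The split above relies on every argument being set off by whitespace, a comma, or a minus sign. SVG path data can also contain exponents such as 1e-5, or numbers that run together such as 0.5.5, which that split would cut in the wrong place. As an alternative, here is a minimal sketch that matches the numbers themselves instead of splitting on separators. The class and method names are hypothetical, and it does not handle the compact flag notation that arc commands allow.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of any existing library.
public static class SvgNumberTokenizer
{
    // One number: optional sign, digits with an optional fraction
    // (or a bare fraction such as ".5"), then an optional exponent.
    private static readonly Regex Number =
        new Regex(@"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?");

    // Returns every number found in the argument string, in order.
    public static float[] ParseArguments(string args)
    {
        return Number.Matches(args)
            .Cast<Match>()
            // SVG always uses '.' as the decimal separator, so parse
            // with the invariant culture rather than the machine's locale.
            .Select(m => float.Parse(m.Value, CultureInfo.InvariantCulture))
            .ToArray();
    }
}
```

Calling SvgNumberTokenizer.ParseArguments on the argument string of the first "c" command from the question should yield its six values, with the minus signs kept as part of the numbers.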
I would wrap this in an SVGCommand class, like this:
class SVGCommand
{
    public char command {get; private set;}
    public float[] arguments {get; private set;}

    public SVGCommand(char command, params float[] arguments)
    {
        this.command=command;
        this.arguments=arguments;
    }

    // Parses one token such as "h6.336": a command letter followed by its arguments.
    public static SVGCommand Parse(string SVGpathstring)
    {
        var cmd = SVGpathstring[0];
        string remainingargs = SVGpathstring.Substring(1);
        // discard whitespace and commas, but keep the - as part of the number
        string argSeparators = @"[\s,]|(?=-)";
        var splitArgs = Regex
            .Split(remainingargs, argSeparators)
            .Where(t => !string.IsNullOrEmpty(t));
        // SVG always uses '.' as the decimal separator, so parse with the invariant
        // culture instead of the machine's locale (needs using System.Globalization;)
        float[] floatArgs = splitArgs.Select(arg => float.Parse(arg, CultureInfo.InvariantCulture)).ToArray();
        return new SVGCommand(cmd,floatArgs);
    }
}
Now a simple "interpreter" could look something like this:
string path = "M70.491,50.826c-2.232,1.152-6.913,2.304-12.817,2.304c-13.682,0-23.906-8.641-23.906-24.626" +
"c0-15.266,10.297-25.49,25.346-25.49c5.977,0,9.865,1.296,11.521,2.16l-1.584,5.112C66.747,9.134,63.363,8.27,59.33,8.27" +
"c-11.377,0-18.938,7.272-18.938,20.018c0,11.953,6.841,19.514,18.578,19.514c3.888,0,7.777-0.792,10.297-2.016L70.491,50.826z";
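// split only on actual command letters, so an exponent such as the e in 1e-5 is left alone
string separators = @"(?=[MZLHVCSQTAmzlhvcsqta])";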
var tokens = Regex.Split(path, separators).Where(t => !string.IsNullOrEmpty(t));
// our "interpreter". Runs the list of commands and does something for each of them.
foreach (string token in tokens){
// note that Parse could throw an exception
// if the path is not correct
SVGCommand c = SVGCommand.Parse(token);
Console.WriteLine("doing something with command {0}", c.command);
}
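One thing the loop above does not cover: SVG lets a command letter be followed by several argument sets in a row, so a single "c" may carry 12 or 18 numbers instead of 6. A minimal sketch of how the parsed arguments could be cut into per-command groups follows. The class and method names are hypothetical, and the argument counts are taken from the SVG path grammar.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of any existing library.
public static class SvgCommandArity
{
    // Arguments consumed per repetition of each command (same for lower case).
    private static readonly Dictionary<char, int> Arity = new Dictionary<char, int>
    {
        {'M', 2}, {'L', 2}, {'T', 2},
        {'H', 1}, {'V', 1},
        {'C', 6}, {'S', 4}, {'Q', 4},
        {'A', 7}, {'Z', 0}
    };

    // Splits a flat argument array into one array per repetition.
    // Note: for M/m, SVG treats every pair after the first as an implicit
    // lineto; this sketch only groups the numbers and leaves that to the caller.
    public static List<float[]> SplitIntoGroups(char command, float[] arguments)
    {
        int size;
        if (!Arity.TryGetValue(char.ToUpperInvariant(command), out size))
            throw new FormatException("Unknown path command: " + command);

        var groups = new List<float[]>();
        if (size == 0)
        {
            if (arguments.Length != 0)
                throw new FormatException(command + " takes no arguments");
            return groups;
        }
        if (arguments.Length == 0 || arguments.Length % size != 0)
            throw new FormatException(
                command + " expects a multiple of " + size + " arguments");

        for (int i = 0; i < arguments.Length; i += size)
        {
            var group = new float[size];
            Array.Copy(arguments, i, group, 0, size);
            groups.Add(group);
        }
        return groups;
    }
}
```

Inside the loop you could call SvgCommandArity.SplitIntoGroups(c.command, c.arguments) and handle each group as one curve or line. Throwing on a wrong argument count also catches malformed paths early.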
If you need to do something more sophisticated, F# is probably better suited for the job (and is interoperable with C#). I'm not suggesting you learn F# just for this specific task; I just thought I'd mention it in case you are already looking into it for something else.
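It would be possible to do this using the WPF Geometry class. As far as I can tell, the Path Markup Syntax used by WPF is essentially the same as the SVG path syntax, so it should accept typical SVG path data.
using System.Windows.Media; // Geometry and PathGeometry (WPF)

var data = "M5.4,3.806h6.336v43.276h20.738v5.256H5.4V3.806z";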
var geometry = Geometry.Parse(data);
var pathGeometry = PathGeometry.CreateFromGeometry(geometry);
foreach (var figure in pathGeometry.Figures)
{
// Do something interesting with each path figure.
foreach (var segment in figure.Segments)
{
// Do something interesting with each segment.
}
}