I was just about to implement an override of ToString() on a particular business class in order to produce an Excel-friendly format to write to an output file, which will be picked up later and processed. Here's what the data is supposed to look like:
5555555 "LASTN SR, FIRSTN" 5555555555 13956 STREET RD TOWNSVILLE MI 48890 25.88 01-003-06-0934
It's no big deal for me to just make a format string and override ToString(), but that will change the behavior of ToString() for any objects I decide to serialize this way, making the implementation of ToString() all ragged across the library.
Now, I've been reading up on IFormatProvider, and a class implementing it sounds like a good idea, but I'm still a little confused about where all this logic should reside and how to build the formatter class.
What do you guys do when you need to make a CSV, tab-delimited or some other non-XML arbitrary string out of an object?
Here is a generic way of creating CSV from a list of objects, using reflection:
public static string ToCsv<T>(string separator, IEnumerable<T> objectlist)
{
    Type t = typeof(T);
    FieldInfo[] fields = t.GetFields();
    string header = String.Join(separator, fields.Select(f => f.Name).ToArray());
    StringBuilder csvdata = new StringBuilder();
    csvdata.AppendLine(header);
    foreach (var o in objectlist)
        csvdata.AppendLine(ToCsvFields(separator, fields, o));
    return csvdata.ToString();
}

public static string ToCsvFields(string separator, FieldInfo[] fields, object o)
{
    StringBuilder linie = new StringBuilder();
    bool first = true;
    foreach (var f in fields)
    {
        // Separate on field position, not on builder length, so that an
        // empty first field does not swallow the following separator.
        if (!first)
            linie.Append(separator);
        first = false;
        var x = f.GetValue(o);
        if (x != null)
            linie.Append(x.ToString());
    }
    return linie.ToString();
}
Many variations can be made, such as writing out directly to a file in ToCsv(), or replacing the StringBuilder with an IEnumerable and yield statements.
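As a rough sketch of the first variation mentioned above, the method can write each line straight to a TextWriter instead of building one large string. The class name CsvWriterSketch and the method name WriteCsv are made up for this illustration; like the code above, it does no quoting or escaping.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Reflection;

// Hypothetical variation: names are illustrative, not from the original answer.
public static class CsvWriterSketch
{
    // Writes a header row and one row per object directly to the writer,
    // using the public fields of T (no escaping of separators or quotes).
    public static void WriteCsv<T>(TextWriter writer, string separator, IEnumerable<T> objectlist)
    {
        FieldInfo[] fields = typeof(T).GetFields();
        writer.WriteLine(string.Join(separator, fields.Select(f => f.Name)));
        foreach (var o in objectlist)
        {
            writer.WriteLine(string.Join(separator,
                fields.Select(f => (f.GetValue(o) ?? "").ToString())));
        }
    }
}
```

Passing the result of File.CreateText(...) as the writer sends the output to a file, while a StringWriter keeps it in memory.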
Here is a simplified version of Per Hejndorf's CSV idea (without the memory overhead, as it yields each line in turn). Due to popular demand it also supports both fields and simple properties by use of Concat.
This example was never intended to be a complete solution, just an advancement of the original idea posted by Per Hejndorf. To generate valid CSV you need to replace any text delimiter characters within the text with a sequence of 2 delimiter characters, e.g. a simple .Replace("\"", "\"\"").
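A minimal sketch of that escaping step, assuming the double quote is the text delimiter. The helper name EscapeCsvField is hypothetical and not part of the code in this answer; it also wraps the field in quotes when it contains the separator or a line break, which is the usual CSV convention.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class CsvEscaping
{
    // Returns the value unchanged when it is safe, otherwise wraps it in
    // double quotes and doubles any double quotes inside it.
    public static string EscapeCsvField(string value, string separator = ",")
    {
        if (value == null) return "";

        bool needsQuotes = value.Contains(separator)
            || value.Contains("\"")
            || value.Contains("\n")
            || value.Contains("\r");

        if (!needsQuotes) return value;

        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }
}
```

Each value would be passed through such a helper before the fields are joined with the separator.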
After using my own code again in a project today, I realised I should not have taken anything for granted when I started from the example of @Per Hejndorf. It makes more sense to assume a default delimiter of "," (comma) and make the delimiter the second, optional, parameter. My own library version also provides a 3rd header parameter that controls whether a header row should be returned, as sometimes you only want the data.
public static IEnumerable<string> ToCsv<T>(IEnumerable<T> objectlist, string separator = ",", bool header = true)
{
    FieldInfo[] fields = typeof(T).GetFields();
    PropertyInfo[] properties = typeof(T).GetProperties();
    if (header)
    {
        yield return String.Join(separator, fields.Select(f => f.Name).Concat(properties.Select(p => p.Name)).ToArray());
    }
    foreach (var o in objectlist)
    {
        yield return string.Join(separator, fields.Select(f => (f.GetValue(o) ?? "").ToString())
            .Concat(properties.Select(p => (p.GetValue(o, null) ?? "").ToString())).ToArray());
    }
}
You then use it like this for comma-delimited output:
foreach (var line in ToCsv(objects))
{
    Console.WriteLine(line);
}
Or like this for another delimiter (e.g. TAB):
foreach (var line in ToCsv(objects, "\t"))
{
    Console.WriteLine(line);
}
Write the list to a comma-delimited CSV file:
// Verbatim string (@"...") so that \t in the path is not read as a tab escape.
using (TextWriter tw = File.CreateText(@"C:\testoutput.csv"))
{
    foreach (var line in ToCsv(objects))
    {
        tw.WriteLine(line);
    }
}
Or write it tab-delimited:
using (TextWriter tw = File.CreateText(@"C:\testoutput.txt"))
{
    foreach (var line in ToCsv(objects, "\t"))
    {
        tw.WriteLine(line);
    }
}
If you have complex fields/properties you will need to filter them out of the select clauses.
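One way such filtering might look is sketched below. What counts as a "simple" type is an assumption made here (primitives, enums, string, decimal, DateTime, Guid and their nullable forms), and the names SimpleMemberCsv and IsSimple are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical variation of ToCsv that skips complex members.
public static class SimpleMemberCsv
{
    // Assumption: these are the types worth writing as a single CSV value.
    private static bool IsSimple(Type type)
    {
        Type t = Nullable.GetUnderlyingType(type) ?? type;
        return t.IsPrimitive || t.IsEnum
            || t == typeof(string) || t == typeof(decimal)
            || t == typeof(DateTime) || t == typeof(Guid);
    }

    public static IEnumerable<string> ToCsv<T>(IEnumerable<T> objectlist, string separator = ",", bool header = true)
    {
        // Keep only simple fields, and simple readable non-indexer properties.
        FieldInfo[] fields = typeof(T).GetFields()
            .Where(f => IsSimple(f.FieldType)).ToArray();
        PropertyInfo[] properties = typeof(T).GetProperties()
            .Where(p => p.CanRead && p.GetIndexParameters().Length == 0 && IsSimple(p.PropertyType))
            .ToArray();

        if (header)
        {
            yield return string.Join(separator,
                fields.Select(f => f.Name).Concat(properties.Select(p => p.Name)));
        }
        foreach (var o in objectlist)
        {
            yield return string.Join(separator,
                fields.Select(f => (f.GetValue(o) ?? "").ToString())
                    .Concat(properties.Select(p => (p.GetValue(o, null) ?? "").ToString())));
        }
    }
}
```

Members of any other type, such as collections or nested objects, are simply left out of both the header and the data rows.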
Here is the earlier, fields-only version of the same idea (without the memory overhead, as it yields each line in turn). It has only 4 lines of code :)
public static IEnumerable<string> ToCsv<T>(string separator, IEnumerable<T> objectlist)
{
    FieldInfo[] fields = typeof(T).GetFields();
    yield return String.Join(separator, fields.Select(f => f.Name).ToArray());
    foreach (var o in objectlist)
    {
        yield return string.Join(separator, fields.Select(f => (f.GetValue(o) ?? "").ToString()).ToArray());
    }
}
You can iterate it like this:
foreach (var line in ToCsv(",", objects))
{
    Console.WriteLine(line);
}
where objects is a strongly typed list of objects.
The same method, extended with Concat to include properties as well as fields:
public static IEnumerable<string> ToCsv<T>(string separator, IEnumerable<T> objectlist)
{
    FieldInfo[] fields = typeof(T).GetFields();
    PropertyInfo[] properties = typeof(T).GetProperties();
    yield return String.Join(separator, fields.Select(f => f.Name).Concat(properties.Select(p => p.Name)).ToArray());
    foreach (var o in objectlist)
    {
        yield return string.Join(separator, fields.Select(f => (f.GetValue(o) ?? "").ToString())
            .Concat(properties.Select(p => (p.GetValue(o, null) ?? "").ToString())).ToArray());
    }
}
As a rule of thumb, I advocate overriding ToString only as a tool for debugging; if it's for business logic, it should be an explicit method on the class/interface.
For simple serialization like this, I'd suggest a separate class that knows about both your CSV output library and your business objects and does the serialization, rather than pushing the serialization into the business objects themselves.
This way you end up with a class per output format that produces a view of your model.
For more complex serialization, where you're trying to write out an object graph for persistence, I'd consider putting it in the business classes, but only if it makes for cleaner code.
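A minimal sketch of the "separate class per output format" idea for the simple case. Everything here is hypothetical: the Account class and its properties stand in for the real business object, and the tab-delimited layout with a quoted name is only an assumption based on the sample line in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

// Hypothetical business object; it knows nothing about output formats.
public class Account
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Phone { get; set; }
    public decimal Balance { get; set; }
}

// One class per output format: this one owns the tab-delimited layout.
public class AccountTabDelimitedWriter
{
    // Builds a single output line; the name is quoted, with embedded quotes doubled.
    public string FormatLine(Account a)
    {
        return string.Join("\t",
            a.Id.ToString(CultureInfo.InvariantCulture),
            "\"" + (a.Name ?? "").Replace("\"", "\"\"") + "\"",
            a.Phone ?? "",
            a.Balance.ToString("0.00", CultureInfo.InvariantCulture));
    }

    // Writes one line per account to any TextWriter (file, console, string).
    public void Write(TextWriter output, IEnumerable<Account> accounts)
    {
        foreach (var a in accounts)
        {
            output.WriteLine(FormatLine(a));
        }
    }
}
```

ToString() on Account stays free for debugging, and a second format would get its own writer class without touching the business object.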