Is there a way to determine the format of text in C#/.NET?
Something like this would be very useful:
public TextFormat TextTools.GetTextFormat(string text);
switch (TextTools.GetTextFormat(mystring))
{
    case TextFormat.RichText: break;
    case TextFormat.PlainText: break;
}
I've looked around on MSDN but couldn't find such a tool.
This is only a heuristic check, but you can build your own function starting with something like this (and, of course, extend it to handle other formats):
public static TextFormat GetFormat(string text) {
    if (text.TrimStart().StartsWith(@"{\rtf", StringComparison.Ordinal))
        return TextFormat.RichText;
    return TextFormat.PlainText;
}
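The snippet above uses a TextFormat type that is never defined, so here is a minimal self-contained sketch of how the pieces could fit together. The enum, the TextTools class name (borrowed from the question), the null/empty guard, and the extra Html member are all assumptions added for illustration; the HTML test in particular is a crude guess, not a real detector.

```csharp
using System;

// Assumed enum: the snippets in this thread use TextFormat without defining it.
public enum TextFormat
{
    PlainText,
    RichText,
    Html // hypothetical extra member, only to show how the check can be extended
}

public static class TextTools
{
    public static TextFormat GetFormat(string text)
    {
        // Treat null or empty input as plain text instead of throwing.
        if (string.IsNullOrEmpty(text))
            return TextFormat.PlainText;

        string trimmed = text.TrimStart();

        // RTF documents open with the group "{\rtf".
        if (trimmed.StartsWith(@"{\rtf", StringComparison.Ordinal))
            return TextFormat.RichText;

        // Crude HTML guess (an assumption for illustration only).
        if (trimmed.StartsWith("<!DOCTYPE html", StringComparison.OrdinalIgnoreCase) ||
            trimmed.StartsWith("<html", StringComparison.OrdinalIgnoreCase))
            return TextFormat.Html;

        return TextFormat.PlainText;
    }
}
```

With this in place, the switch from the question can be written as switch (TextTools.GetFormat(mystring)) with one case per enum member.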
A better check means parsing the RTF text to make sure it is not just a random string that looks like RTF. Because parsing may be expensive (in terms of time), I'd suggest doing a quick check first to exclude everything that is certainly not RTF:
public static TextFormat GetFormat(string text) {
    if (text.TrimStart().StartsWith(@"{\rtf", StringComparison.Ordinal)) {
        if (IsValidRtf(text))
            return TextFormat.RichText;
    }
    return TextFormat.PlainText;
}
In the innermost if you can decide what to do with text that looks like RTF but is not valid (in this example I simply treat it as plain text). A possible, naive and inefficient, implementation of IsValidRtf() that relies on the RichTextBox control (and, underneath it, on the Windows API implementation) may be:
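// Requires: using System; using System.Windows.Forms;
private static bool IsValidRtf(string text) {
    try {
        // Dispose the control so its native resources are released promptly.
        using (var box = new RichTextBox()) {
            box.Rtf = text;
        }
    }
    catch (ArgumentException) {
        // The Rtf setter rejects text it cannot load as RTF.
        return false;
    }
    return true;
}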
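If creating a RichTextBox for every string is too costly, a cheaper structural test could sit between the prefix test and IsValidRtf(). The sketch below is not a parser and not part of any library: RtfQuickCheck and HasClosedRtfGroup are hypothetical names, and the rule it applies (the opening "{\rtf" group must eventually be closed by an unescaped brace) is an assumption about what is worth rejecting early. It can still be fooled, for example by embedded binary data, so treat it as one more filter rather than as validation.

```csharp
using System;

public static class RtfQuickCheck
{
    // Hypothetical helper: true when the text starts with "{\rtf" and the
    // opening group is closed by a matching, unescaped '}'.
    public static bool HasClosedRtfGroup(string text)
    {
        if (text == null)
            return false;

        string s = text.TrimStart();
        if (!s.StartsWith(@"{\rtf", StringComparison.Ordinal))
            return false;

        int depth = 0;
        for (int i = 0; i < s.Length; i++)
        {
            char c = s[i];
            if (c == '\\')
            {
                // Skip the character after a backslash so that
                // "\{", "\}" and "\\" are not counted as group delimiters.
                i++;
            }
            else if (c == '{')
            {
                depth++;
            }
            else if (c == '}')
            {
                depth--;
                if (depth == 0)
                    return true; // the top-level group is closed
            }
        }

        // Reached the end of the text with the top-level group still open.
        return false;
    }
}
```

In GetFormat() this could be called before IsValidRtf(), so that obviously truncated input never reaches the control.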