 

How to determine text format in C#

Tags:

c#

.net

format

Is there a way to determine the format of text in C#/.NET?

Something like this would be very useful:

public TextFormat TextTools.GetTextFormat(string text);

switch(TextTools.GetTextFormat(mystring))
{

  case TextFormat.RichText: break;
  case TextFormat.PlainText: break;

}

I've looked around on MSDN but couldn't find such a tool.

IEnumerable asked Feb 14 '23 11:02



1 Answer

This is only a heuristic check, but you can build your own function starting with something like this (you can, of course, extend it to handle other formats):

public static TextFormat GetFormat(string text) {
    if (text.TrimStart().StartsWith(@"{\rtf", StringComparison.Ordinal))
        return TextFormat.RichText;

    return TextFormat.PlainText;
}
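
The snippet above relies on a TextFormat type that the question only sketches. A minimal self-contained version might look like the following; the two-value enum, the TextTools class name (borrowed from the question) and the choice to treat null or empty input as plain text are all assumptions for illustration, not part of the original answer:

```csharp
using System;

// Hypothetical enum: the question only names RichText and PlainText.
public enum TextFormat { PlainText, RichText }

public static class TextTools {
    // Heuristic only: looks for the RTF signature, does not validate the content.
    public static TextFormat GetFormat(string text) {
        // Assumption: null or empty input is reported as plain text
        // instead of throwing a NullReferenceException.
        if (string.IsNullOrEmpty(text))
            return TextFormat.PlainText;

        if (text.TrimStart().StartsWith(@"{\rtf", StringComparison.Ordinal))
            return TextFormat.RichText;

        return TextFormat.PlainText;
    }
}
```

With this in place, TextTools.GetFormat(@"{\rtf1\ansi Hello}") returns TextFormat.RichText and TextTools.GetFormat("Hello") returns TextFormat.PlainText.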

A more reliable check parses the RTF text to make sure it's not just a random string that looks like RTF. Because parsing may be expensive (in terms of time), I'd suggest doing a quick check first to exclude everything that definitely isn't RTF:

public static TextFormat GetFormat(string text) {
    if (text.TrimStart().StartsWith(@"{\rtf", StringComparison.Ordinal)) {
        if (IsValidRtf(text))
            return TextFormat.RichText;
    }

    return TextFormat.PlainText;
}

In the innermost if you can decide what to do with text that looks like RTF but isn't valid (in this example I just treat it as plain text). A possible, naive and inefficient implementation of IsValidRtf() that relies on the RichTextBox control (and, underneath it, on the Windows API implementation) may be:

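// Requires: using System; using System.Windows.Forms;
private static bool IsValidRtf(string text) {
    // RichTextBox is IDisposable, so dispose it once the check is done.
    using (var box = new RichTextBox()) {
        try {
            // The Rtf setter throws ArgumentException for text it cannot parse.
            box.Rtf = text;
        }
        catch (ArgumentException) {
            return false;
        }
    }

    return true;
}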
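
If creating a RichTextBox is too heavy, or Windows Forms is not available, a cheaper structural check can sit between the signature test and full parsing. The sketch below is not from the original answer and is not an RTF parser: the RtfHeuristics class and LooksLikeBalancedRtf method are hypothetical names, and it only checks that braces are balanced (ignoring escaped \{ and \}) and that the outermost group ends the text. It assumes nothing but whitespace surrounds the document and does not account for raw binary sections:

```csharp
using System;

public static class RtfHeuristics {
    // Hypothetical helper: a necessary (not sufficient) condition for valid RTF.
    public static bool LooksLikeBalancedRtf(string text) {
        if (string.IsNullOrEmpty(text))
            return false;

        string trimmed = text.Trim();
        if (!trimmed.StartsWith(@"{\rtf", StringComparison.Ordinal))
            return false;

        int depth = 0;
        for (int i = 0; i < trimmed.Length; i++) {
            char c = trimmed[i];
            if (c == '\\') {
                // Skip the next character: it is either an escaped \{, \}, \\
                // or the first letter of a control word, never a group brace.
                i++;
            }
            else if (c == '{') {
                depth++;
            }
            else if (c == '}') {
                depth--;
                if (depth < 0)
                    return false; // more closing than opening braces
                if (depth == 0 && i != trimmed.Length - 1)
                    return false; // content after the outermost group
            }
        }

        return depth == 0; // every opened group was closed
    }
}
```

Text that fails this check can be treated as plain text right away; text that passes may still be invalid RTF, so a full parse is still needed when certainty matters.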
Adriano Repetti answered Feb 16 '23 01:02
