MarkdownSharp is converting URLs that contain underscore characters

I am using MarkdownSharp in one of my projects and have noticed that if a URL contains a pair of underscore characters, the text between them is treated as italic and the underscores are replaced with <em> tags.

I've searched Google but can't find any reference to this behaviour. Some of the comments in the MarkdownSharp code suggest that it is written the way it is precisely to prevent this from happening. See the snippet below from the MarkdownSharp code:

The order in which other subs are called here is essential. Link and image substitutions need to happen before EscapeSpecialChars(), so that any *'s or _'s in the a and img tags get encoded.

    public string Transform(string text)
    {
        if (String.IsNullOrEmpty(text)) return "";

        Setup();

        text = Normalize(text);

        text = HashHTMLBlocks(text);
        text = StripLinkDefinitions(text);
        text = RunBlockGamut(text);
        text = Unescape(text);

        Cleanup();

        return text + "\n";
    }

Is there a known workaround for this behaviour?

**UPDATE:** I have just tested entering a URL on Stack Overflow, which I believe uses a version of MarkdownSharp (with AutoHyperlink enabled, as in my project). It handles a single underscore within the URL, but as soon as a pair of underscores appears in the URL, it breaks.

marcusstarnes asked Jul 01 '26 22:07


1 Answer

MarkdownSharp has a configuration option that was created for this very reason:

    /// <summary>
    /// when true, bold and italic require non-word characters on either side
    /// WARNING: this is a significant deviation from the markdown spec
    /// </summary>
    public bool StrictBoldItalic { get; set; }

For some background info, see point 1 in https://blog.stackoverflow.com/2008/06/three-markdown-gotcha/.
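
As a minimal sketch of how the option might be switched on, assuming a MarkdownSharp version that exposes `StrictBoldItalic` and `AutoHyperlink` as settable properties on the `Markdown` class (the sample URL is made up for illustration):

```csharp
using System;
using MarkdownSharp;

class Program
{
    static void Main()
    {
        // Hypothetical input: a bare URL containing a pair of underscores.
        string input = "See http://example.com/some_page_name for details.";

        var markdown = new Markdown
        {
            AutoHyperlink = true,    // enabled in the question's project
            StrictBoldItalic = true  // emphasis needs non-word characters on either side
        };

        // Inspect the HTML to check that no <em> tags were added inside the URL.
        Console.WriteLine(markdown.Transform(input));
    }
}
```

With the option enabled, underscores surrounded by word characters should no longer be treated as emphasis markers, but it is worth checking the output against your own URLs.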

balpha answered Jul 03 '26 14:07