
How to tokenize markdown using Node.js?

I'm building an iOS app with a view whose content comes from markdown.

My idea is to parse markdown stored in MongoDB into a JSON object that looks something like:

{
    "h1": "This is the heading",
    "p": "Here's the first paragraph",
    "link": {
        "text": "Text for link",
        "url": "http://exampledomain.com"
    }
}

On the server I am running Node.js, and I was looking at the module marked, which seems to be the most popular one out there. It gives me access to the Lexer, which tokenizes the markdown into a custom object. But when I look at that object, the link has not been tokenized. If I go ahead and parse the markdown to HTML, the link is detected and the HTML looks correct.

After trying some more modules without success, I thought that maybe I could do this on the client instead, and found MMMarkdown, which seemed promising. But again, it worked fine when parsing directly to HTML, while the intermediate MMDocument produced from the markdown did not contain any MMElement of type Link.

So, is there anything fundamental about markdown parsing that I am missing? Is the lexing of inline links supposed to be done in a second pass, or something? I can't get my head around it.

If nothing else works, I might just use a UIWebView filled with the HTML from the parsed markdown, but then we would have to design the whole thing again in CSS, and we are running out of time, so we can't really afford the double work.

bobmoff asked Feb 26 '14 12:02


1 Answer

Have you looked at https://github.com/evilstreak/markdown-js?

It seems to give you access to the syntax tree.

For example:

var md = require( "markdown" ).markdown,
    text = "Header\n---------------\n\n" +
           "This is a paragraph\n\n" +
           "This is [an example](http://example.com/ \"Title\") inline link.";

// parse the markdown into a syntax tree (nested arrays, including inline links)
var tree = md.parse( text );

console.log(JSON.stringify(tree));

produces

[
    "markdown",
    [
        "header",
        {
            "level": 2
        },
        "Header"
    ],
    [
        "para",
        "This is a paragraph"
    ],
    [
        "para",
        "This is ",
        [
            "link",
            {
                "href": "http://example.com/",
                "title": "Title"
            },
            "an example"
        ],
        " inline link."
    ]
]
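
To get from a tree like this to something closer to the JSON in the question, you could walk the nested arrays yourself. The following is only a rough sketch: `splitNode`, `collect` and `toBlocks` are hypothetical helper names, not part of markdown-js, and it assumes every node has the shape `[tag, attributes?, ...children]` seen in the output above. It uses a list of blocks instead of a single object, since a document can contain several paragraphs and an object cannot repeat the key "p".

```javascript
// Hypothetical helpers, not part of markdown-js.
// Assumption: each node is [tag, attrs?, ...children], where attrs
// (if present) is a plain object in second position.
function splitNode(node) {
  var second = node[1];
  var hasAttrs = node.length > 1 && second !== null &&
    typeof second === "object" && !Array.isArray(second);
  return {
    tag: node[0],
    attrs: hasAttrs ? second : {},
    children: node.slice(hasAttrs ? 2 : 1)
  };
}

// Concatenate the plain text of the children and record every link found.
function collect(children, links) {
  return children.map(function (child) {
    if (typeof child === "string") return child;
    var n = splitNode(child);
    var text = collect(n.children, links);
    if (n.tag === "link") links.push({ text: text, url: n.attrs.href });
    return text;
  }).join("");
}

// Turn the top-level blocks into flat objects with a type, text and links.
function toBlocks(tree) {
  return splitNode(tree).children.map(function (block) {
    var n = splitNode(block);
    var links = [];
    var out = {
      type: n.tag === "header" ? "h" + n.attrs.level : n.tag,
      text: collect(n.children, links)
    };
    if (links.length > 0) out.links = links;
    return out;
  });
}

// The tree printed above, pasted in as a literal so the sketch runs
// without the markdown module installed.
var tree = ["markdown",
  ["header", { "level": 2 }, "Header"],
  ["para", "This is a paragraph"],
  ["para", "This is ",
    ["link", { "href": "http://example.com/", "title": "Title" }, "an example"],
    " inline link."]
];

console.log(JSON.stringify(toBlocks(tree), null, 2));
```

Each block keeps its full text, and the links are listed separately with their text and URL, which should be easy to map onto native views on the iOS side.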
Hendrik Demmer answered Sep 29 '22 08:09