
How can you convert RTF text to Markdown-syntaxed plain text in Cocoa?

I need to be able to convert RTF or HTML to Markdown-syntaxed plain text for uploading to my server. I need to achieve this in Cocoa/Obj-C 2.0. Does anyone know how to do this?

Thanks so much —» Alex.


Edited Thu 4:53 PM

Umm. In answer to Yuji's comment, I'm trying to make an NSStatusItem droplet that accepts text. It doesn't matter what format the text is in, but I need to be able to format it either as plain text or plain text formatted with Markdown. I guess since I don't know what kind of text I'll be receiving...

Alexsander Akers asked May 20 '10 18:05


3 Answers

Here are the formats pandoc parses and writes:

> pandoc --help
pandoc [OPTIONS] [FILES]

Input formats:  native, markdown, markdown+lhs, rst, rst+lhs, html, 
latex, latex+lhs

Output formats:  native, html, html+lhs, s5, docbook, opendocument, odt, latex, 
latex+lhs, context, texinfo, man, markdown, markdown+lhs, plain, rst, rst+lhs, 
mediawiki, rtf

Unfortunately, RTF isn't one of the formats it parses. Pandoc is a Haskell program, so it isn't convenient to get without installing the Haskell Platform. From a parsed document it can write a sort of 'plain' sub-Markdown, standard Markdown, or its own enriched Markdown, as well as a pile of other formats. The internal ('native') representation is much richer than the standard Markdown spec requires, so less information is lost, and you can recover the HTML for your Markdown, or make a PDF via LaTeX, and so on. It is fairly easy to hack at for special purposes.

There is a growing number of bindings to the Pandoc libraries from other languages, though I don't know whether any of them are stable. A search of GitHub suggests that the most relevant one for hooking up with Obj-C is the plain C libpandoc. Ruby seems to have the most activity (I guess because it's GitHub), with pandoku, pandoc-ruby, rails-pandoc and so forth.
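If no binding fits, the pandoc executable can also be driven as a subprocess, since HTML is among the input formats listed above. The following is only a rough sketch in Python, assuming a `pandoc` binary is on the PATH; the function names `pandoc_command` and `html_to_markdown` are made up for illustration. In a Cocoa app the same invocation would presumably go through NSTask instead.

```python
import shutil
import subprocess


def pandoc_command(source_format="html", target_format="markdown"):
    """Build the argument list for a pandoc conversion that reads from stdin."""
    # -f selects the input (reader) format, -t the output (writer) format.
    return ["pandoc", "-f", source_format, "-t", target_format]


def html_to_markdown(html):
    """Convert an HTML string to Markdown; return None if pandoc is not installed."""
    if shutil.which("pandoc") is None:
        return None
    result = subprocess.run(
        pandoc_command(),
        input=html,           # pandoc reads stdin when no input files are given
        capture_output=True,
        text=True,
        check=True,           # raise CalledProcessError if pandoc reports failure
    )
    return result.stdout


if __name__ == "__main__":
    print(html_to_markdown("<p>Hello <em>world</em></p>"))
```

This covers only the HTML half of the question; RTF input would still need a separate step, because pandoc does not parse it.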

applicative answered Nov 04 '22 16:11


Oooph, this is going to be tricky. As Yuji said, you can express a lot more in HTML/RTF than in markdown. That being the case...

I'd convert the content into an NSAttributedString. You can easily construct an NSAttributedString from RTF data; HTML will be much more difficult. Once you do that, however, it'll be a matter of inspecting all the attributes on the string and applying the equivalent markdown to a plaintext version of the content.
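To make the "inspect the attributes and apply the equivalent markdown" step concrete, here is a language-neutral sketch in Python rather than Obj-C. The `Run` type is a hypothetical stand-in for one attribute run of an NSAttributedString; in Cocoa the runs would presumably come from enumerating the string's attributes, and bold/italic would have to be derived from the font traits of each run. It handles only bold, italic, and links, and does not escape Markdown special characters in the text.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """Hypothetical stand-in for one attribute run of an attributed string."""
    text: str
    bold: bool = False
    italic: bool = False
    link: str = ""


def run_to_markdown(run):
    """Wrap one run's text in the Markdown markers matching its attributes."""
    stripped = run.text.strip()
    if not stripped:
        return run.text  # whitespace-only runs carry no visible formatting
    # Keep surrounding whitespace outside the markers: "** bold**" does not render.
    lead = run.text[: len(run.text) - len(run.text.lstrip())]
    trail = run.text[len(run.text.rstrip()):]
    marked = stripped
    if run.italic:
        marked = f"*{marked}*"
    if run.bold:
        marked = f"**{marked}**"
    if run.link:
        marked = f"[{marked}]({run.link})"
    return lead + marked + trail


def runs_to_markdown(runs):
    """Concatenate the Markdown for each run, in order."""
    return "".join(run_to_markdown(run) for run in runs)


if __name__ == "__main__":
    runs = [
        Run("Hello "),
        Run("bold", bold=True),
        Run(" and "),
        Run("both ", bold=True, italic=True),
        Run("link", link="http://example.com"),
    ]
    print(runs_to_markdown(runs))
    # prints: Hello **bold** and ***both*** [link](http://example.com)
```

Paragraph-level attributes (headings, lists, block quotes) would need a second pass over whole paragraphs, which this sketch leaves out.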

Researching a bit more:

  • Markdownify - convert HTML to Markdown in PHP
  • Pandoc - convert Markdown (and some other formats) to other rich text formats. It supports Markdown => RTF, so you could perhaps use that as a basis for an inverse conversion.

Dave DeLong answered Nov 04 '22 18:11


There's an online form that does just this: MarkItDown

Bambax answered Nov 04 '22 16:11