
How to convert HTML to Markdown while retaining non-markdown HTML tags?

I'd like to be able to take an existing HTML snippet and convert it to markdown. I've tried pandoc for this purpose:

pandoc test.html -o test.md

where test.html looked like this:

Hello

<!-- more -->

and some more text

<h2>some heading</h2>

The result was this:

Hello and some more text

some heading
------------

Thus, pandoc not only converts the tags that have a direct Markdown equivalent; it also drops content that I would like to retain as raw HTML (e.g., HTML comments, iframe tags, and so on).

  • How can I convert HTML to markdown in a way that any tags that don't have an equivalent in markdown are retained as raw HTML?
  • More generally, how can I control how the HTML to markdown conversion is done?

In particular, I'd be interested in command-line program options. For example, perhaps there are options that can be supplied to pandoc.

Jeromy Anglim asked Apr 27 '13 06:04



1 Answer

After a bit more searching, I read about the --parse-raw option in a thread on table parsing.

Adding --parse-raw seemed to stop pandoc from stripping the HTML tags that have no markdown equivalent:

pandoc test.html -o test.md --parse-raw
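
A note for newer pandoc releases: to my knowledge, the --parse-raw flag was removed in pandoc 2.0, where the same behavior is requested through the raw_html extension on the HTML reader instead. The sketch below recreates the test.html from the question in a temporary directory and runs the conversion only if pandoc is installed. The workdir variable and the temporary location are illustrative assumptions, not part of the original setup, and the html+raw_html form should be checked against `pandoc --version` and the manual for your release.

```shell
#!/bin/sh
# Recreate the sample input from the question in a scratch directory.
# (workdir is an illustrative name, not something from the original post.)
workdir=$(mktemp -d)

cat > "$workdir/test.html" <<'EOF'
Hello

<!-- more -->

and some more text

<h2>some heading</h2>
EOF

if command -v pandoc >/dev/null 2>&1; then
    # pandoc 1.x, as in the answer above:
    #   pandoc "$workdir/test.html" -o "$workdir/test.md" --parse-raw
    # pandoc 2.x and later (assumption: the raw_html reader extension
    # takes the place of --parse-raw):
    pandoc -f html+raw_html -t markdown "$workdir/test.html" -o "$workdir/test.md" \
        || echo "conversion failed; this pandoc may not accept html+raw_html" >&2
else
    echo "pandoc not found; skipping conversion" >&2
fi
```

If the extension is honored, test.md should still contain the comment and any other tags without a markdown equivalent as raw HTML; open the file to confirm rather than assuming it.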
Jeromy Anglim answered Sep 19 '22 14:09