Preserve Line Breaks in Pandoc Markdown -> LaTeX Conversion

Tags: pandoc

I want to convert the following *.md file into proper LaTeX (*.tex).

Lorem *ipsum* something.
Does anyone know lorem by heart?

That would *sad* because there's always Google.

Expected Behavior / Resulting LaTeX from Pandoc

Lorem \emph{ipsum} something.
Does anyone know lorem by heart?

That would \emph{sad} because there's always Google.

Observed Behavior / Resulting LaTeX from Pandoc

Lorem \emph{ipsum} something. Does anyone know lorem by heart?

That would \emph{sad} because there's always Google.

Why do I care?

1. I'm transitioning a bigger git repo from Markdown to LaTeX, and I want a clean diff and history.
2. I actually like my LaTeX with one sentence per line, even though it does not matter for the typesetting.

How can I get Pandoc to do this?

P.S.: I am aware of the hard_line_breaks extension, but that only adds \\ between the first two lines and does not actually preserve my line breaks.

asked Sep 26 '14 at 19:09 by maxheld



3 Answers

Update

Since pandoc 1.16, this is possible:

pandoc --wrap=preserve

Old answer

Since Pandoc converts the Markdown to an AST-like internal representation, your non-semantic line breaks are lost. So what you're looking for is not possible without some custom scripting (for example, using --no-wrap and then post-processing the output by inserting a line break wherever a dot is followed by a space).

However, you can use the --columns NUMBER option to specify the maximum number of characters on each line. You won't get one sentence per line that way, but lines wrapped at NUMBER characters.
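The custom post-processing idea from the old answer could be sketched roughly as follows. This is only a naive illustration, not part of Pandoc: the function name `one_sentence_per_line` is hypothetical, and the sketch assumes the LaTeX was produced without wrapping (--no-wrap on old versions, --wrap=none from 1.16 on), so that each paragraph sits on a single line. It treats ".", "?" or "!" followed by a space and then an uppercase letter or a backslash as a sentence boundary, which will misfire on abbreviations such as "Dr. Smith".

```python
import re

# Assumed heuristic: a sentence ends at '.', '?' or '!' followed by one space,
# and the next sentence starts with an uppercase letter or a LaTeX command.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?]) (?=[A-Z\\])")


def one_sentence_per_line(latex: str) -> str:
    """Replace the space at each detected sentence boundary with a newline.

    Blank lines between paragraphs are left untouched, because only
    single spaces are ever replaced.
    """
    return _SENTENCE_BOUNDARY.sub("\n", latex)


# Example input: the unwrapped LaTeX from the question.
sample = (
    "Lorem \\emph{ipsum} something. Does anyone know lorem by heart?\n"
    "\n"
    "That would \\emph{sad} because there's always Google.\n"
)
print(one_sentence_per_line(sample))
```

In practice the function would be applied to the contents of the generated *.tex file before committing it; a real sentence splitter would need a list of abbreviations and some care around LaTeX commands that end in a dot.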

answered Oct 14 '22 at 18:10 by mb21


A much simpler solution would be to add two spaces at the end of the first line, after "...something.". This inserts a manual line break (the method is described in the Pandoc manual).

answered Oct 14 '22 at 18:10 by René


I figured out another way to address this problem: leave the original *.md files (which are under version control) unchanged, and simply read them in and have them "pandoced" when building the PDF.

Here's how:

Some markdown.md in project root:

Happy one-sentence-per-line **markdown** stuff.
And another line – makes for clear git diffs!

And some latexify.tex in project root:

\documentclass{article}
\begin{document}

\immediate\write18{pandoc markdown.md -t latex -o tmp.tex}
\input{tmp.tex}

\end{document}

This works just dandy if you have some Markdown components in a LaTeX project, e.g. GitHub READMEs or something similar.

It requires no special package, but the document must be compiled with shell escape enabled (e.g. pdflatex -shell-escape latexify.tex).

answered Oct 14 '22 at 19:10 by maxheld