Read YAML metadata from a Pandoc markdown file

Tags: yaml, pandoc

Is it possible to extract Pandoc's metadata (title, date, et al.) from a markdown file without a Haskell filter, or parsing the --to=json output?

The JSON output is particularly inconvenient for this, since a two-word title looks like:

$ pandoc -t json posts/test.md | jq '.meta | .title'
{
  "t": "MetaInlines",
  "c": [
    {
      "t": "Str",
      "c": "Test"
    },
    {
      "t": "Space"
    },
    {
      "t": "Str",
      "c": "post"
    }
  ]
}

so even after jq has extracted the title, we still have to reassemble the words ourselves, and any emphasis, code, or other inline formatting only makes that harder.
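To make the problem concrete, here is a minimal Python sketch of the reconstruction that the `-t json` route forces on us. The function name `inlines_to_text` is made up for illustration, and it only handles a few inline node types (`Str`, `Space`, `SoftBreak`, `LineBreak`, `Emph`, `Strong`, `Code`); anything else is silently skipped, so treat it as a sketch rather than a complete walker of Pandoc's AST.

```python
import json


def inlines_to_text(inlines: list) -> str:
    """Flatten a list of Pandoc inline nodes into plain text.

    Hypothetical helper: covers only a handful of node types and
    drops all formatting.
    """
    parts = []
    for node in inlines:
        kind = node["t"]
        if kind == "Str":
            parts.append(node["c"])
        elif kind in ("Space", "SoftBreak", "LineBreak"):
            parts.append(" ")
        elif kind in ("Emph", "Strong"):
            # The content is a nested list of inlines.
            parts.append(inlines_to_text(node["c"]))
        elif kind == "Code":
            # The content is [attributes, code text].
            parts.append(node["c"][1])
        # Any other node type is skipped in this sketch.
    return "".join(parts)


# The MetaInlines value that jq printed for the two-word title above.
title_node = json.loads("""
{"t": "MetaInlines",
 "c": [{"t": "Str", "c": "Test"}, {"t": "Space"}, {"t": "Str", "c": "post"}]}
""")

print(inlines_to_text(title_node["c"]))  # prints "Test post"
```

Even this much only recovers plain text, and every additional inline type needs its own branch, which is exactly the complication I would like to avoid.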

OJFord asked Jan 14 '17 22:01


1 Answer

We can use the template variable $meta-json$ for this.

Put the variable in a file (give the file an extension, to stop Pandoc looking in its own directories) and then pass it with pandoc --template=file.ext.

Pandoc's output is a JSON object with keys "title", "date", "tags", etc. and their respective values from the markdown document, which we can easily parse, filter, and manipulate with jq.

$ echo '$meta-json$' > /tmp/metadata.pandoc-tpl
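$ # post.md is a placeholder for your markdown file; without a file argument pandoc reads stdin
$ pandoc --template=/tmp/metadata.pandoc-tpl post.md | jq '.title,.tags'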
"The Title"
[
  "a tag",
  "another tag"
]
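If the metadata is wanted in a script rather than at the shell, the same trick can be wrapped up. This is only a sketch, assuming `pandoc` is on the PATH; the names `parse_metadata` and `read_metadata` are invented here and are not part of any Pandoc API. Note that, per the Pandoc manual, the field values in `$meta-json$` are rendered in the selected output format, so with the default output any emphasis in a title would presumably come back as HTML markup.

```python
import json
import subprocess
import tempfile
from pathlib import Path


def parse_metadata(json_text: str) -> dict:
    """Decode the JSON object emitted by a $meta-json$ template."""
    metadata = json.loads(json_text)
    if not isinstance(metadata, dict):
        raise ValueError("expected a JSON object of metadata")
    return metadata


def read_metadata(markdown_path: str) -> dict:
    """Run pandoc on markdown_path with a one-line $meta-json$ template.

    Assumes the pandoc executable is installed and on the PATH.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # The extension stops pandoc from looking in its own directories.
        template = Path(tmp) / "metadata.pandoc-tpl"
        template.write_text("$meta-json$\n")
        result = subprocess.run(
            ["pandoc", f"--template={template}", markdown_path],
            check=True,
            capture_output=True,
            text=True,
        )
    return parse_metadata(result.stdout)


# Parsing output shaped like the jq example above:
sample = '{"title": "The Title", "tags": ["a tag", "another tag"]}'
print(parse_metadata(sample)["tags"])
```

Usage would be something like `read_metadata("post.md")["title"]`, where `post.md` is a placeholder for your own file.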
OJFord answered Oct 19 '22 07:10