I have a LaTeX-generated .toc file containing the table of contents of a large document. I would like to convert this TOC to a (GitHub-flavored) Markdown list, e.g. with pandoc.
For example, I have
\contentsline {chapter}{\numberline {1}Introduction}{1}{chapter.1}
\contentsline {section}{\numberline {1.1}Aim and Motivation}{1}{section.1.1}
\contentsline {section}{\numberline {1.2}State of the art}{1}{section.1.2}
\contentsline {section}{\numberline {1.3}Outline}{1}{section.1.3}
\contentsline {chapter}{\numberline {2}Fundamentals}{2}{chapter.2}
...
in my .toc file, and would like to get something like this:
1. Introduction
1.1. Aim and Motivation
1.2. State of the art
1.3. Outline
2. Fundamentals
Another alternative would be to extract this information (without the content) out of the tex-file directly. However, I could not get this working and I also think it would be more error-prone.
Any suggestions?
Another alternative would be to extract this information out of the tex-file directly.
Pandoc can do that:
$ pandoc -s --toc input.tex -o output.md
To exclude the document body content, you'll have to use a custom pandoc markdown template:
$ pandoc -D markdown > mytemplate.md
Modify mytemplate.md to keep $toc$ and remove $body$, then use it with pandoc --template mytemplate.md ...
If you want to customize it further, I would recommend outputting to HTML (pandoc -t html) instead of markdown, then writing a small script that traverses the HTML DOM and does your numbering etc.
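A rough sketch of such a script in Python, using only the standard library's html.parser. It assumes the input is just the TOC fragment (nested <ul>/<li> lists with one <a> per entry, as produced when the template keeps only the TOC) and that pandoc was run without --number-sections, so the numbers are computed here. The names TocToMarkdown and toc_html_to_markdown are made up for this example.

```python
from html.parser import HTMLParser


class TocToMarkdown(HTMLParser):
    """Collect [number, title] pairs from nested <ul>/<li> lists.

    Assumption: the fed HTML contains only the TOC lists; any other
    <ul> or <a> in the input would be counted as well.
    """

    def __init__(self):
        super().__init__()
        self.counters = []    # one running counter per open <ul> level
        self.entries = []     # [number, title] pairs in document order
        self.in_link = False  # True while inside an <a> element

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.counters.append(0)
        elif tag == "li" and self.counters:
            # Bump the counter of the innermost list and build "1.2"-style numbers
            self.counters[-1] += 1
            number = ".".join(str(n) for n in self.counters)
            self.entries.append([number, ""])
        elif tag == "a" and self.entries:
            self.in_link = True

    def handle_endtag(self, tag):
        if tag == "ul" and self.counters:
            self.counters.pop()
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        # Only text inside the entry's link is treated as the title
        if self.in_link and self.entries:
            self.entries[-1][1] += data


def toc_html_to_markdown(html_text):
    """Return a nested Markdown bullet list with computed section numbers."""
    parser = TocToMarkdown()
    parser.feed(html_text)
    parser.close()
    lines = []
    for number, title in parser.entries:
        depth = number.count(".")  # "1" -> 0, "1.1" -> 1, ...
        lines.append(f"{'  ' * depth}- {number} {title.strip()}")
    return "\n".join(lines)
```

Calling toc_html_to_markdown on the TOC HTML and printing the result gives one bullet per entry, indented two spaces per nesting level. The number is written without a trailing dot so that Markdown does not read "1." after the bullet as the start of a nested ordered list.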