
A simple way to insert a table of contents in a multiple page pdf generated using PdfPages

I am using pandas to read data from several data files and generate a multi-page PDF with PdfPages, where each page contains the matplotlib figures from one data file. It would be nice to have a linked table of contents, or a bookmark for each page, so that I can easily find the figures for a given data file. Is there a simple way to achieve this (for example, by inserting the name of the data file) in Python 3.5?

Sakshath asked Feb 13 '17 15:02



1 Answer

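A simple workaround is to use Pandoc instead of PdfPages: save each figure as an image, write a Markdown file with one heading and one image per page, and let Pandoc build the PDF together with a table of contents. Note that the code uses f-strings, which require Python 3.6 or later.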

  1. First, import the necessary libraries.
import os
import numpy as np
import matplotlib.pyplot as plt
  2. Draw some figures.
def draw_fig():
    # Save five simple line plots as PNG files in the fig/ directory.
    os.makedirs('fig', exist_ok=True)
    x = np.linspace(0, 5, 100)
    for i in range(1, 6):
        y = x + i
        plt.figure()
        plt.plot(x, y)
        plt.savefig(f'fig/fig{i}.png')
        plt.close()  # release the figure once it has been saved
  3. Create a Markdown template. In this toy example, each page contains one figure exported by matplotlib. You can tailor the render method to your requirements.
class PdfTemplate():
    def __init__(self, figs, filename="output", toc=True):
        self.figs = figs
        self.toc = toc
        self.filename = filename
        self.text = []
        
    def render(self):
        self._pagebreak()
        for fig in self.figs:
            self._h1(os.path.splitext(fig)[0])  # heading = file name without extension
            self._img(os.path.join("fig", fig))
            self._pagebreak()
        self.text = "\n\n".join(self.text)
        
    def export(self):
        md_file = f"{self.filename}.md"
        pdf_file = f"{self.filename}.pdf"
        pandoc = ["pandoc", md_file, "-o", pdf_file]
        with open(md_file, "w") as f:
            f.write(self.text)
        if self.toc:
            pandoc.append("--toc")
        os.system(" ".join(pandoc))
        
    def _pagebreak(self):
        self.text.append(r"\pagebreak")  # raw string: "\p" is not a valid escape sequence
        
    def _h1(self, text):
        self.text.append(f"# {text}")
        
    def _img(self, img):
        self.text.append(f"![]({img})")
  4. Finally, run the code and export the PDF.
draw_fig()
pdf = PdfTemplate(figs=sorted(os.listdir("fig")))  # os.listdir returns names in arbitrary order
pdf.render()
pdf.export()
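
To tie the table of contents to data file names, as the question asks, the same idea can be sketched with the headings passed in explicitly. This is only a rough variant under some assumptions: the function names build_markdown, pandoc_command and export_pdf are made up for illustration, the figures are assumed to be saved as images already, and pandoc (plus a LaTeX engine for PDF output) is assumed to be installed. It also passes the command to subprocess.run as a list rather than joining it into a shell string, so file names containing spaces should not need quoting.

```python
import subprocess


def build_markdown(figures):
    """Return Markdown with one heading and one image per page.

    `figures` maps a heading (for example a data file name) to the
    path of the image that should appear under that heading.
    """
    parts = [r"\pagebreak"]
    for title, image_path in figures.items():
        parts.append(f"# {title}")           # each heading becomes a TOC entry
        parts.append(f"![]({image_path})")   # the figure itself
        parts.append(r"\pagebreak")
    return "\n\n".join(parts)


def pandoc_command(md_file, pdf_file, toc=True):
    """Build the pandoc call as an argument list (no shell involved)."""
    command = ["pandoc", md_file, "-o", pdf_file]
    if toc:
        command.append("--toc")
    return command


def export_pdf(figures, filename="output", toc=True):
    """Write the Markdown file, then convert it to PDF with pandoc."""
    md_file = f"{filename}.md"
    with open(md_file, "w") as f:
        f.write(build_markdown(figures))
    # check=True raises CalledProcessError if pandoc exits with an error.
    subprocess.run(pandoc_command(md_file, f"{filename}.pdf", toc), check=True)
```

Calling export_pdf({"run_a.csv": "fig/fig1.png", "run_b.csv": "fig/fig2.png"}) (hypothetical file names) would then be expected to produce output.pdf with one contents entry per data file.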

[Screenshot: contents page]

[Screenshot: figure page]

Cheng answered Sep 20 '22 19:09
