
Convert plain text to PDF in Python

For my project, I get a plain text file (report.txt) from another program. It is formatted entirely in plain text. If you open it in Notepad, it looks nice (as much as a plain text file can). When I open the file in Word and show the formatting marks, I see the dots for spaces and the backwards P for paragraph breaks.

I need to convert this file to PDF and add some other PDF pages to make one final PDF. All this happens in Python.

I am having trouble converting report.txt to PDF. I have ReportLab, and I am able to read the file and make a few changes (like changing the text to Courier), but the spacing gets lost. When the file is read, any extra spaces appear to be stripped.

Questions: a) Is there an easier way to convert report.txt to PDF? b) If not, is there a way to keep my spaces when I read the file? c) Or is there a parameter I'm missing from my paragraph style that will keep the original look?

Here's my code:

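# ------------------------------------
# Imports
# ------------------------------------

from reportlab.lib.enums import TA_JUSTIFY
from reportlab.lib.pagesizes import A4, portrait
from reportlab.lib.styles import ParagraphStyle, getSampleStyleSheet
from reportlab.platypus import Paragraph, SimpleDocTemplate

# ------------------------------------
# Styles
# ------------------------------------

styleSheet = getSampleStyleSheet()
mystyle = ParagraphStyle(name='normal', fontName='Courier',
                         fontSize=10,
                         alignment=TA_JUSTIFY,
                         leading=1.2*12,
                         parent=styleSheet['Normal'])

#=====================================================================================
model_report = 'report.txt'

# Create document for writing to pdf
# (pdfPath is defined elsewhere in the script and is not shown here)
doc = SimpleDocTemplate(str(pdfPath),
                        rightMargin=40, leftMargin=40,
                        topMargin=40, bottomMargin=25,
                        pagesize=A4)
doc.pagesize = portrait(A4)

# Container for 'Flowable' objects
elements = []

# Open the model report and split it into lines
with open(model_report) as f:
    infile = f.read()
report_paragraphs = infile.split("\n")

# One Paragraph per line of the report; this is where the spacing gets lost
for para in report_paragraphs:
    para1 = '<font face="Courier" >%s</font>' % para
    elements.append(Paragraph(para1, style=mystyle))
doc.build(elements)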
user1327390 asked Apr 11 '12 19:04


2 Answers

I've created a small helper function that converts multi-line text to a PDF file with a "report look" by using a monospaced font. Lines that are too long are wrapped at spaces so that they fit the page width:

import textwrap
from fpdf import FPDF

def text_to_pdf(text, filename):
    # Page geometry and font metrics (approximate values)
    a4_width_mm = 210
    pt_to_mm = 0.35
    fontsize_pt = 10
    fontsize_mm = fontsize_pt * pt_to_mm
    margin_bottom_mm = 10
    # Rough width of one Courier character at this font size
    character_width_mm = 7 * pt_to_mm
    # textwrap needs a whole number of characters per line
    width_text = int(a4_width_mm / character_width_mm)

    pdf = FPDF(orientation='P', unit='mm', format='A4')
    pdf.set_auto_page_break(True, margin=margin_bottom_mm)
    pdf.add_page()
    pdf.set_font(family='Courier', size=fontsize_pt)

    for line in text.split('\n'):
        lines = textwrap.wrap(line, width_text)

        # A blank input line yields no wrapped lines, so emit a line break
        if len(lines) == 0:
            pdf.ln()

        # Write each wrapped piece on its own line
        for wrap in lines:
            pdf.cell(0, fontsize_mm, wrap, ln=1)

    pdf.output(filename, 'F')

This is how you would use this function to convert a text file to a PDF file:

input_filename = 'test.txt'
output_filename = 'output.pdf'

# Read the whole text file, then render it to PDF
with open(input_filename) as f:
    text = f.read()

text_to_pdf(text, output_filename)
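
The wrapping step can be checked on its own, without producing a PDF. The sketch below is not part of the helper above; `chars_per_line` and `wrap_report_lines` are hypothetical names, and the numbers assume 10 mm left and right margins and Courier's fixed advance width of 0.6 times the font size. Under those assumptions roughly 89 characters fit on an A4 line at 10 pt, so the helper's estimate of about 85 is on the safe side.

```python
import textwrap

MM_PER_PT = 25.4 / 72  # one point is 1/72 inch


def chars_per_line(page_width_mm=210, margin_mm=10, font_size_pt=10):
    """Estimate how many Courier characters fit between the side margins."""
    # Assumption: every Courier glyph is 0.6 em wide.
    char_width_mm = 0.6 * font_size_pt * MM_PER_PT
    usable_width_mm = page_width_mm - 2 * margin_mm
    return int(usable_width_mm / char_width_mm)


def wrap_report_lines(text, width):
    """Wrap each line of text to at most `width` characters.

    Blank lines are kept as empty strings, and runs of spaces inside a
    line that does not need wrapping are left untouched.
    """
    wrapped = []
    for line in text.split("\n"):
        pieces = textwrap.wrap(line, int(width))
        wrapped.extend(pieces if pieces else [""])
    return wrapped


if __name__ == "__main__":
    width = chars_per_line()
    for piece in wrap_report_lines("Name      Value\n\n    indented", width):
        print(repr(piece))
```

Note that `textwrap.wrap` drops the whitespace at the point where it breaks a long line, so only lines that fit within the width keep their column alignment exactly.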
m13r answered Oct 04 '22 02:10


ReportLab is the usual recommendation, as you can see from the "Related" questions on the right side of this page.

Have you tried creating the text with just styleSheet['Normal']? That is, if you get proper-looking output with the following, the problem lies somewhere in your custom style.

Paragraph(para1, style=styleSheet['Normal'])
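
If the spacing is still lost with the stock style, a likely cause is that Paragraph treats its text as XML-like markup and collapses runs of whitespace, much as HTML does. One possible workaround, sketched here with a hypothetical helper name `to_paragraph_markup`, is to escape each line and turn its spaces into `&nbsp;` entities before handing it to Paragraph. The sketch assumes Paragraph renders `&nbsp;` as a non-breaking space.

```python
from xml.sax.saxutils import escape


def to_paragraph_markup(line):
    """Return line as markup in which every space is a non-breaking space."""
    # Escape &, < and > first so report text is not mistaken for markup,
    # then replace each space so that runs of spaces are not collapsed.
    return escape(line).replace(" ", "&nbsp;")


if __name__ == "__main__":
    print(to_paragraph_markup("a  b"))  # prints a&nbsp;&nbsp;b
```

In the question's loop, the converted line would be passed to Paragraph in place of the raw line. ReportLab also provides a Preformatted flowable in reportlab.platypus that is meant to keep whitespace and line breaks as written, which may be a simpler fit for a fixed-width report.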
alexis answered Oct 04 '22 03:10
