
HTML to PDF on Google App Engine

We're currently trying to convert HTML files to PDF on App Engine using Python. The HTML files come from a third-party vendor, so we have no control over their format. Both the Flexible and Standard environments are options, but every path we go down seems to hit a roadblock:

  • PDFkit: converts perfectly offline, but requires the wkhtmltopdf binary to be installed, and no pip package is available for it
  • xhtml2pdf (pisa): works even on GAE Standard, but doesn't support many features such as float and copes poorly with badly formatted HTML
  • WeasyPrint: its C dependencies should in theory run on the Flexible environment, but no pip packages are available for them, including Cairo and Pango

Has anyone got a robust solution running on AppEngine with any of the above? Or with other libraries I am missing?

Tim Owens asked Apr 16 '18 13:04



1 Answer

I ran into this same problem a year ago and concluded that it is currently not possible on App Engine, at least not with good conversion quality. (Someone please point out if things have changed.)

xhtml2pdf: I was able to run it successfully on standard App Engine, but I was not at all happy with the conversion quality.
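For reference, a minimal sketch of what the xhtml2pdf route can look like, assuming the `xhtml2pdf` package is installed. The helper name `render_pdf` is made up for illustration and is not the code I actually ran:

```python
import io


def render_pdf(html: str) -> bytes:
    """Render an HTML string to PDF bytes with xhtml2pdf (pure Python)."""
    # Imported lazily so this module still loads where xhtml2pdf is absent.
    from xhtml2pdf import pisa

    buffer = io.BytesIO()
    # CreatePDF writes the PDF into `dest` and returns a status object
    # whose `err` attribute is non-zero when the conversion failed.
    status = pisa.CreatePDF(html, dest=buffer)
    if status.err:
        raise RuntimeError("xhtml2pdf could not convert the document")
    return buffer.getvalue()
```

Something like this runs without any native binaries, which is why it works on Standard, but the CSS support limits mentioned in the question still apply.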

PDFkit: I ran into the same problem and came up with a different solution. I hosted PDFkit on a Compute Engine instance and exposed an endpoint that accepts a POST request containing the HTML file and returns the converted PDF in the response. This gave me the best results in terms of quality and processing speed.
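A rough sketch of that setup, not my exact code. It assumes `pdfkit` and the wkhtmltopdf binary are installed on the Compute Engine instance, and the names `make_handler`, `serve`, `fetch_pdf` and port 8080 are placeholders chosen for the example. The server side uses only the standard library's `http.server`; a real deployment would also want authentication and a request size limit.

```python
import http.client
from http.server import BaseHTTPRequestHandler, HTTPServer


def convert_with_pdfkit(html: str) -> bytes:
    """Convert HTML to PDF bytes via pdfkit (needs wkhtmltopdf on the host)."""
    import pdfkit  # imported lazily; only required on the conversion host

    # Passing False as the output path makes pdfkit return the PDF bytes.
    return pdfkit.from_string(html, False)


def make_handler(convert):
    """Build a request handler that runs `convert` on each POSTed HTML body."""

    class ConvertHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            html = self.rfile.read(length).decode("utf-8", errors="replace")
            try:
                pdf = convert(html)
            except Exception:
                self.send_error(500, "conversion failed")
                return
            self.send_response(200)
            self.send_header("Content-Type", "application/pdf")
            self.send_header("Content-Length", str(len(pdf)))
            self.end_headers()
            self.wfile.write(pdf)

        def log_message(self, *args):
            pass  # keep the example quiet; use real logging in production

    return ConvertHandler


def serve(port: int = 8080) -> None:
    """Run the conversion endpoint on the Compute Engine instance."""
    HTTPServer(("0.0.0.0", port), make_handler(convert_with_pdfkit)).serve_forever()


def fetch_pdf(host: str, port: int, html: str, timeout: float = 60) -> bytes:
    """Client side (e.g. the App Engine app): POST HTML, get PDF bytes back."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request(
            "POST",
            "/",
            body=html.encode("utf-8"),
            headers={"Content-Type": "text/html; charset=utf-8"},
        )
        resp = conn.getresponse()
        body = resp.read()
        if resp.status != 200:
            raise RuntimeError(f"conversion service returned {resp.status}")
        return body
    finally:
        conn.close()
```

On the instance you would call `serve()`; from App Engine, something like `fetch_pdf(instance_host, 8080, html)` returns the PDF bytes, where `instance_host` stands for whatever address your instance is reachable at.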

It did incur some extra charges, but I was able to use the instance for something else too ;). I initially chose the smallest available configuration, since I was not storing anything on the Compute Engine instance.

Shubham Sinha answered Oct 10 '22 10:10