Should I dynamically recreate a PDF, rather than store it in either the database or the filesystem?

I need customers to be able to download PDFs of letters that have been sent to them.

I have read the threads about database versus filesystem storage of documents or images, and the consensus seems to be that, for anything more than a few images, the filesystem is the way to go.

What I want to know:

  • Would a reasonable alternative be to store only the letter details in the database and recreate the PDF on the fly when it is requested?
  • Is that approach superior or inferior to fetching the PDF from the filesystem?
Tony asked Oct 22 '08 22:10



3 Answers

If it is for archival purposes, I would definitely store the PDF, because your PDF generation script may change in the future, and then the letter will no longer be exactly the same as what was originally sent. The customer will expect it to be exactly the same.

It doesn't matter which approach is superior; sometimes it is better to go with the approach that is safer.

Adam Pierce answered Nov 06 '22 19:11


I'd store it off for two reasons:

1) If you ever change how you generate the PDF, you probably don't want historical items to change. If you generate them every time, either they will change or you need to keep compatibility code to generate "old-style" records.

2) Disk space is cheap. Users' patience isn't. Unless you're really pressed for storage, or pulling the file out of storage is harder than generating the PDF, be kind to your users and store it off.

Obviously, if you create thousands of these an hour from a sparse dataset, you may not have the storage. But if you have the space, I'd vote for "use it".
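
To illustrate the "store it off" approach, here is a minimal sketch in Python. It assumes the PDF bytes already exist at the moment the letter is sent and that letters are identified by an integer ID. The function names `archive_letter` and `fetch_letter`, the `letter_<id>.pdf` naming scheme, and the archive directory are all hypothetical, not something from the question.

```python
from pathlib import Path


def archive_letter(letter_id: int, pdf_bytes: bytes, archive_dir: Path) -> Path:
    """Store the PDF exactly as sent; refuse to overwrite an existing copy."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    pdf_path = archive_dir / f"letter_{letter_id}.pdf"
    # Mode "xb" raises FileExistsError if the file already exists,
    # so an archived letter is never silently replaced by a regenerated one.
    with open(pdf_path, "xb") as f:
        f.write(pdf_bytes)
    return pdf_path


def fetch_letter(letter_id: int, archive_dir: Path) -> bytes:
    """Return the archived bytes; nothing is regenerated on download."""
    return (archive_dir / f"letter_{letter_id}.pdf").read_bytes()
```

The download path only reads from disk, so later changes to the generation code cannot alter what the customer sees.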

Philip Rieck answered Nov 06 '22 17:11


Is there a forensic reason why you have to maintain records of letters sent to customers? If you are going to regenerate on the fly, how do you know that future code changes won't rewrite the letter? At the very least, the customer could make that argument in court if the information is used in a lawsuit.
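
If you do store the PDF, one possible way to show later that the file has not changed is to record a digest of its bytes at send time and compare against it on retrieval. The sketch below is only an illustration under that assumption: `fingerprint` and `matches_record` are hypothetical names, and where the digest is kept (for example, a database column next to the letter details) is up to you.

```python
import hashlib
import hmac


def fingerprint(pdf_bytes: bytes) -> str:
    """Return the SHA-256 digest of the PDF as a hex string."""
    return hashlib.sha256(pdf_bytes).hexdigest()


def matches_record(pdf_bytes: bytes, recorded_digest: str) -> bool:
    """Check that the bytes served today match the digest recorded at send time."""
    return hmac.compare_digest(fingerprint(pdf_bytes), recorded_digest)
```

A regenerated PDF would almost never pass this check, because even a changed creation timestamp inside the file produces a different digest. That is one more reason to keep the original bytes.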

Toybuilder answered Nov 06 '22 17:11