Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Blob and Storage Requirement

I have a requirement to store user uploaded files to the database (filesystem is not an option).

The files that are uploaded are different types (e.g. PDF, EXCEL, etc).

I have a problem on deciding whether or not to use MEDIUMBLOB as type to store these files as binary data.

And the confusion arises due to the fact that the size of these files vary with huge difference. Like some of the files are a few hundred KiloBytes (e.g. 114 KB) but some others are upto 1.5 MegaBytes.

So I really need to use MEDIUMBLOB as the column type. But I have little confusion as the actually memory taken up depends on the size of the uploaded file itself or file size itself is ignored and memory is just allocated based on the data type. This is really important because most of the files that will uploaded everyday (lot of them) will really small in size and memory (space) available in the disk is limited.

As I read the following section of MySql doc:

10.5. Data Type Storage Requirements

L represents the actual length in bytes of a given string value.

Data type                  Storage Required
========================================================
TINYBLOB, TINYTEXT  L + 1 bytes, where L < 2 ^ 8
BLOB, TEXT           L + 2 bytes, where L < 2 ^ 16
MEDIUMBLOB, MEDIUMTEXT  L + 3 bytes, where L < 2 ^ 24
LONGBLOB, LONGTEXT  L + 4 bytes, where L < 2 ^ 32

I think the memory used depends on the size of the actual file that's uploaded. i.e. If I have the column type as MEDIUMBLOB and if I upload a file that's 114 KB in size then the only 114 KBytes + 3 Bytes of disk memory will be used and not (2 ^ 24) Bytes + 3 Bytes.

Am I thinking right? Or will I be wasting a lots of disk memory by storing a lot of files (that are 100 to 300 KB in size) in a MEDIUMBLOB field.

like image 695
jombie Avatar asked Jan 19 '12 18:01

jombie


1 Answers

MySQL will give you the size of each table using SHOW TABLE STATUS. So if you upload a few test files you should be able to estimate the actual size they'll take up in the database. Each table does have some overhead so I would use as many files as reasonable for testing.

like image 146
Justin Lucas Avatar answered Oct 30 '22 00:10

Justin Lucas