I need to store a page of html data in a field called ProjectDescription in mySQL database using Spring & JPA 2.1 . I have read this question and all other questions with BLOB tag, but need some clarity on why the fields are saving in a certain manner in my db. I have created the fields in the manner below using both String and byte[] field types.
Method1: Saving data as TEXT (After encoding in Base64 format I save my html data as String using the below method)
@Basic(fetch = FetchType.LAZY)
@Lob
private String projectDescription = "";
Method2: Saving data as binary using BLOB
@Basic(fetch = FetchType.LAZY)
@Lob
@Column(length=5000)
private byte[] projectDescription1 =new byte[0];
My assumptions: I assume that since a page of html data is not very large, TEXT is ok as comapred to BLOB
I tested both and the fields are saved as below in mySQL database
In Method1:
- Type: TEXT
- DisplaySize is constantly 1431655765.
This size doesn't change irrespective of my @Column(length=5000) annotaton.
In Method2
- Type: BLOB
- DisplaySize: -1
Question1: What is the source of this DisplaySize ? This seems quite large in case of TEXT and very small(-1) in case of byte[] field type. Why doesn't @Column length seem to change change the DisplaySize.
Question2: Is it ok to store HTML data as String field type(eventually as TEXT) as opposed to byte[] (eventually as a blob)?
Note: I have read all the questions with BLOB tags and am clear that images/documents need to be saved as BLOB and text as as CLOB/TEXT. However, would like to confirm again for HTML data given how large DisplaySize is allocated in DB for TEXT.
Thanks.
If it is a whole page, why go through the extra layer of fetching from a database table? If it is only part of a page, then I recommend TEXT CHARACTER SET utf8mb4
. Any non-UTF-8 text in the page will cause trouble; may as well catch it early.
And the database industry is converging on using UTF-8 for all text.
Base64 is 8/6 times as big. And all it is doing is avoiding trouble with non-UTF-8 characters that should not be there. If anything, compress it in the client and store in a BLOB
, thereby shrinking 3/1.
In MySQL, TEXT
is 64K bytes. You might be better off with MEDIUMTEXT
, which has a limit of 16MB. I say "bytes" because, for example, Chinese needs 3, sometimes 4, bytes per character, so only about 25K characters of Chinese text will fit in TEXT
.
"DisplaySize is constantly 1431655765" -- What??? A gigabyte for a web page; no way! Even if that includes images (which it should not), that is quite unreasonable. Edit: eggyal's Comment on 2^32/3 probably explains this odd number.
In MySQL, SELECT length(my_text) ...
will get the number of bytes in that column.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With