Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Database performance: Using one entity/table with the max. possible properties or split to different entities/tables?

im need to design some database tables but im not sure about the performance impact. In my case its more about the read performance than for saving the data.

The situation

With the help of pattern recognition im finding out how many values of a certain object needs to be saved in my postgresql database. Amount other lets say fixed properties the only difference is if 1, 2 or 3 values of the same type needs to be saved.

Currently im having 3 entities/tables which differ only in having having 1, 2 or 3 not nullable properties of the same type.

For example:

EntityTestOne/TableOne {
    ... other (same) properties
    String optionOne;
}

EntityTestTwo/TableTwo {
    ... other (same) properties
    String optionOne;
    String optionTwo;

}

EntityTestThree/TableThree {
    ... other (same) properties
    String optionOne;
    String optionTwo;
    String optionThree;
}

I expect to have several million records in production and im thinking what could be the performance impact of this variant and what could be alternatives.

Alternatives

Other options which come into my mind:

  • Use only one entity class or table with 3 options (optionTwo and optionThree will be nullable then). If to talk of millions of expected records plus caching im asking myself isn't it a kind of 'waste' to save millions of null values in at least two (caching) layers (database itself and hibernate). In a another answer i read yesterday saving a null value in postgresql need only 1 bit what i think isnt that much if we talk about several millions of records which can contain some nullable properties (link).
  • Create another entity/table and use a collection (list or set) relationship instead

For example:

EntityOption {
    String value;
}

EntityTest {
    ... other (same) properties
    List<EntityOption> options;
}
  • If to use this relationship: What would give a better performance in case of creating new records: Creating for every new EntityTest new EntityOption's or doing a lookup before and reference a existing EntityOption if exists? What about the read performance while fetching them later and the joins which will be needed then? Compared to the variant with one plain Entity with three options i can imagine it could be slightly slower...

As im not that strong in database design and working with hibernate im interested of the pros and cons of these approaches and if there are even more alternatives. I even would like to ask the question if postgresql is the right choice for this or if should think about using another (free) database.

Thanks!

like image 371
StephanM Avatar asked May 11 '17 08:05

StephanM


2 Answers

The case is pretty clear in my opinion: If you have an upper limit of three properties per object, use a single table with nullable attributes.

A NULL value does not take up any space in the database. For every row, PostgreSQL stores a bitmap that contains which attributes are NULL. This bitmap is always stored, except when all attributes are not nullable. See the documentation for details.
So don't worry about storage space in this case.

Using three different tables or storing the attributes in a separate table will probably lead to UNIONs or JOINs in your queries, which will make the queries more complicated and slow.

like image 76
Laurenz Albe Avatar answered Oct 22 '22 13:10

Laurenz Albe


There are many inheritance strategy for creating entity class, I think you should go with single table strategy, where there will be a discriminator column (managed by hibernate itself), and all common filed will be used by each entity and some specific fields will be use by specific entity and remain null for other entity. This will get improved read performance. For your ref. : http://www.thejavageek.com/2014/05/14/jpa-single-table-inheritance-example/

like image 1
Prateek Singh Avatar answered Oct 22 '22 11:10

Prateek Singh