I'm having trouble deciding which approach to use. I have several entity "types", let's call them A, B and C, which share a certain number of attributes (about 10-15). I created a table called ENTITIES, with a column for each of the common attributes. A, B and C also have some (mostly) unique attributes (all boolean, approximately 10 to 30 each). I'm unsure what the best approach is for modelling the tables:
<ol>
<li>Create a column in the ENTITIES table for each attribute, meaning that entity types that don't have that attribute will just have a null value.</li>
<li>Use separate tables for the unique attributes of each entity type, which is a bit harder to manage.</li>
<li>Use an hstore column; each entity will store its unique flags in this column.</li>
<li>???</li>
</ol>
I'm inclined to use 3, but I'd like to know if there's a better solution.

<h3>(4) Inheritance</h3>
The cleanest style from a database-design point of view would probably be inheritance, like @yieldsfalsehood suggested in his comment. Here is an example with more information, code and links: Select (retrieve) all records from multiple schemas using Postgres

The current implementation of inheritance in Postgres has a number of limitations, though. Among others, you cannot define a common foreign key constraint for all inheriting tables. Read the last chapter about caveats carefully.

<h3>(3) <code>hstore</code>, <code>json</code> (pg 9.2+) / <code>jsonb</code> (pg 9.4+)</h3>
A good alternative for lots of different attributes or a changing set of attributes, especially since you can even have functional indices on attributes inside the column:
<ul>
<li>unique index or constraint on hstore key</li>
<li>Index for finding an element in a JSON array</li>
<li><code>jsonb</code> indexing in Postgres 9.4</li>
</ul>
EAV type of storage has its own set of advantages and disadvantages. This question on dba.SE provides a very good overview.

<h3>(1) One table with lots of columns</h3>
It's the simple, kind of brute-force alternative. Judging from your description, you would end up with around 100 columns, most of them boolean and most of them <code>NULL</code> most of the time. Add a column <code>entity_id</code> to mark the type. Enforcing constraints per type is a bit awkward with lots of columns. I wouldn't bother with too many constraints that might not be needed.

The maximum number of columns allowed is 1600. With most of the columns being <code>NULL</code>, rows stay small, so this is the upper limit that actually applies (rows full of non-null values could hit the page size limit first). As long as you keep it down to 100 - 200 columns, I wouldn't worry. <code>NULL</code> storage is very cheap in Postgres (basically 1 bit per column, but it's more complex than that). That's only roughly 13 - 25 bytes extra per row (one bit for each of 100 - 200 columns). Contrary to what one might assume (!), this is most probably much smaller on disk than the <code>hstore</code> solution.
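To make the comparison between options (1) and (3) concrete, here is a minimal sketch that builds both variants side by side and reports an approximate per-row size. All table and column names (`entities_wide`, `entities_doc`, `entity_type`, `a_flag_1`, ...) are hypothetical and only stand in for the real attributes; the `CHECK` constraint and the expression index are just one possible way to set this up, and the numbers you get depend on your actual data.

```sql
-- Option (1): one wide table; flags that do not apply to a type stay NULL.
CREATE TABLE entities_wide (
    id          serial PRIMARY KEY,
    entity_type text NOT NULL CHECK (entity_type IN ('A', 'B', 'C')),  -- type marker
    name        text NOT NULL,   -- stands in for the 10-15 shared attributes
    a_flag_1    boolean,
    a_flag_2    boolean,
    b_flag_1    boolean,
    c_flag_1    boolean,
    -- Optional per-type rule: flags of type A must be NULL for other types.
    CONSTRAINT a_flags_only_for_a
        CHECK (entity_type = 'A' OR (a_flag_1 IS NULL AND a_flag_2 IS NULL))
);

-- Option (3): shared attributes as columns, type-specific flags in jsonb (pg 9.4+).
CREATE TABLE entities_doc (
    id          serial PRIMARY KEY,
    entity_type text NOT NULL CHECK (entity_type IN ('A', 'B', 'C')),
    name        text NOT NULL,
    flags       jsonb NOT NULL DEFAULT '{}'
);

-- Expression index on a single key inside the jsonb column.
CREATE INDEX entities_doc_a_flag_1_idx ON entities_doc ((flags ->> 'a_flag_1'));

INSERT INTO entities_wide (entity_type, name, a_flag_1, a_flag_2)
VALUES ('A', 'first', true, false);

INSERT INTO entities_doc (entity_type, name, flags)
VALUES ('A', 'first', '{"a_flag_1": true, "a_flag_2": false}');

-- Approximate size of each row in bytes, for comparing the two variants.
SELECT 'wide' AS variant, pg_column_size(w) AS row_bytes FROM entities_wide w
UNION ALL
SELECT 'jsonb' AS variant, pg_column_size(d) AS row_bytes FROM entities_doc d;
```

Running the final query against a realistic sample of your own data is a more reliable guide than any rule of thumb, since the key names stored in every `jsonb` or `hstore` value add up quickly.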
While such a table looks monstrous to the human eye, it is no problem for Postgres to handle. RDBMSes specialize in brute force. You might define a set of views (for each type of entity) on top of the base table with just the columns of interest and work with those where applicable. That's like the reverse approach of inheritance. But this way you can have common indexes and foreign keys etc. Not that bad. I might do that.

All that said, the decision is still yours. It all depends on the details of your requirements.
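The per-type views on top of a single base table, mentioned above, could be sketched roughly like this. The names (`entities`, `entity_type`, `entities_a`, `entity_notes`, ...) are hypothetical, and the base table is cut down to a few columns to keep the example short.

```sql
-- Reduced base table: shared attributes plus type-specific flags.
CREATE TABLE entities (
    id          serial PRIMARY KEY,
    entity_type text NOT NULL CHECK (entity_type IN ('A', 'B', 'C')),  -- type marker
    name        text NOT NULL,
    a_flag_1    boolean,
    b_flag_1    boolean
);

-- One view per entity type, exposing only the columns of interest.
CREATE VIEW entities_a AS
SELECT id, name, a_flag_1
FROM   entities
WHERE  entity_type = 'A';

CREATE VIEW entities_b AS
SELECT id, name, b_flag_1
FROM   entities
WHERE  entity_type = 'B';

-- A single foreign key covers all entity types, which table inheritance cannot offer.
CREATE TABLE entity_notes (
    note_id    serial PRIMARY KEY,
    entity_ref integer NOT NULL REFERENCES entities (id),
    note       text NOT NULL
);

-- Work with the narrow view where only type A matters.
SELECT id, name, a_flag_1 FROM entities_a WHERE a_flag_1;
```

Simple single-table views like these are automatically updatable in Postgres 9.3 or later, so `UPDATE` statements can go through the view as well.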

Use case for hstore against multiple columns

answered Sep 19 '22 03:09

Erwin Brandstetter

Tags: postgresql, database-design, hstore

Trasplazio Garzuglio
