
When to use inherited tables in PostgreSQL?

Tags:

postgresql

In which situations should you use inherited tables? I tried them very briefly, and inheritance didn't seem to work the way it does in the OOP world.

I thought it worked like this:

The table users has all the fields required for all user levels, and tables like moderators, admins, bloggers, etc. inherit from it, but fields are not checked against the parent. For example, users has an email field and the inherited bloggers now has it too, but it is not unique across users and bloggers together, i.e. it is the same as if I had added an email field to both tables.
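The behaviour described above can be reproduced with a minimal sketch like the following. The column names id and blog_title and the sample address are assumptions made up for illustration; only users, bloggers and email come from the question.

```sql
-- Hypothetical tables mirroring the question.
CREATE TABLE users (
    id    serial PRIMARY KEY,
    email text NOT NULL UNIQUE
);

-- bloggers gets the id and email columns from users, plus its own column.
CREATE TABLE bloggers (
    blog_title text
) INHERITS (users);

INSERT INTO users (email) VALUES ('a@example.com');

-- This succeeds: the UNIQUE index on users only covers rows stored in
-- users itself, so the same email can appear again in the child table.
INSERT INTO bloggers (email, blog_title) VALUES ('a@example.com', 'My blog');

-- A query on the parent also scans its children, so both rows come back.
SELECT email FROM users;

-- ONLY restricts the query to rows stored directly in users.
SELECT email FROM ONLY users;
```

NOT NULL and CHECK constraints are inherited by the child, while UNIQUE, PRIMARY KEY and FOREIGN KEY constraints are not, which is why the duplicate slips through.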

The only use I could think of is for fields that are commonly needed, like row_is_deleted, created_at, modified_at. Is this the only use for inherited tables?


asked Jun 19 '10 06:06

raspi

2 Answers

"Table inheritance" means something different than "class inheritance" and they serve different purposes.

Postgres is all about data definitions. Sometimes really complex data definitions. OOP (in the common Java-colored sense of things) is about subordinating behaviors to data definitions in a single atomic structure. The purpose and meaning of the word "inheritance" is significantly different here.

In OOP land I might define (being very loose with syntax and semantics here):

    import life

    class Animal(life.Autonomous):
      metabolism = biofunc(alive=True)

      def die(self):
        self.metabolism = False

    class Mammal(Animal):
      hair_color = color(foo=bar)

      def gray(self, mate):
        self.hair_color = age_effect('hair', self.age)

    class Human(Mammal):
      alcoholic = vice_boolean(baz=balls)

The tables for this might look like:

    CREATE TABLE animal
      (name       varchar(20) PRIMARY KEY,
       metabolism boolean NOT NULL);

    CREATE TABLE mammal
      (hair_color  varchar(20) REFERENCES hair_color(code) NOT NULL,
       PRIMARY KEY (name))
      INHERITS (animal);

    CREATE TABLE human
      (alcoholic  boolean NOT NULL,
       FOREIGN KEY (hair_color) REFERENCES hair_color(code),
       PRIMARY KEY (name))
      INHERITS (mammal);

But where are the behaviors? They don't fit anywhere. This is not the purpose of "objects" as they are discussed in the database world, because databases are concerned with data, not procedural code. You could write functions in the database to do calculations for you (often a very good idea, but not really something that fits this case) but functions are not the same thing as methods -- methods as understood in the form of OOP you are talking about are deliberately less flexible.

There is one more thing to point out about inheritance as a schematic device: as of Postgres 9.2 there is no way for a foreign key constraint to reference all of the partitions/table family members at once. You can write checks to do this or get around it another way, but it's not a built-in feature (it comes down to issues with complex indexing, really, and nobody has written the bits necessary to make that automatic). Instead of using table inheritance for this purpose, often a better match in the database for object inheritance is to make schematic extensions to tables. Something like this:

    CREATE TABLE animal
      (name       varchar(20) PRIMARY KEY,
       ilk        varchar(20) REFERENCES animal_ilk NOT NULL,
       metabolism boolean NOT NULL);

    CREATE TABLE mammal
      (animal      varchar(20) REFERENCES animal PRIMARY KEY,
       ilk         varchar(20) REFERENCES mammal_ilk NOT NULL,
       hair_color  varchar(20) REFERENCES hair_color(code) NOT NULL);

    CREATE TABLE human
      (mammal     varchar(20) REFERENCES mammal PRIMARY KEY,
       alcoholic  boolean NOT NULL);

Now we have a canonical reference for the instance of the animal that we can reliably use as a foreign key reference, and we have an "ilk" column that references a table of xxx_ilk definitions which points to the "next" table of extended data (or indicates there is none if the ilk is the generic type itself). Writing table functions, views, etc. against this sort of schema is so easy that most ORM frameworks do exactly this sort of thing in the background when you resort to OOP-style class inheritance to create families of object types.
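As a rough illustration of how little is needed to query this kind of schema, here is a sketch of a view over the extension tables and of a foreign key that points at the canonical animal row. The view name human_full and the vet_visit table are hypothetical; they assume the animal, mammal and human tables defined above already exist.

```sql
-- Hypothetical view that flattens the three extension tables for humans.
CREATE VIEW human_full AS
SELECT a.name,
       a.metabolism,
       m.hair_color,
       h.alcoholic
FROM human  AS h
JOIN mammal AS m ON m.animal = h.mammal
JOIN animal AS a ON a.name   = m.animal;

-- Reads like a single "Human" object with its inherited attributes.
SELECT * FROM human_full WHERE alcoholic;

-- Hypothetical table: because every animal, mammal or human has exactly
-- one row in animal, an ordinary foreign key covers the whole family.
CREATE TABLE vet_visit (
    animal  varchar(20) NOT NULL REFERENCES animal,
    visited date        NOT NULL
);
```

With INHERITS, the equivalent foreign key on the parent table would reject names that exist only in a child table, which is the limitation described earlier.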

answered Sep 23 '22 21:09

zxq9

There are some major reasons for using table inheritance in postgres.

Let's say we have some tables needed for statistics, which are created and filled each month:

    statistics
        - statistics_2010_04 (inherits statistics)
        - statistics_2010_05 (inherits statistics)

In this sample, we have 2,000,000 rows in each table. Each table has a CHECK constraint to make sure only data for the matching month gets stored in it.
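A minimal sketch of such a layout could look like this. The table names and the date column come from the example above; the value column and the index name are assumptions added for illustration.

```sql
-- Parent table: defines the columns and normally stays empty.
CREATE TABLE statistics (
    date  date    NOT NULL,
    value integer NOT NULL   -- hypothetical payload column
);

-- One child per month, each with a CHECK constraint on the split key.
CREATE TABLE statistics_2010_04 (
    CHECK (date >= DATE '2010-04-01' AND date < DATE '2010-05-01')
) INHERITS (statistics);

CREATE TABLE statistics_2010_05 (
    CHECK (date >= DATE '2010-05-01' AND date < DATE '2010-06-01')
) INHERITS (statistics);

-- Indexes are not inherited, so each child gets its own small index.
CREATE INDEX statistics_2010_04_date_idx ON statistics_2010_04 (date);

-- Rows are written to the matching child directly in this sketch.
INSERT INTO statistics_2010_04 (date, value) VALUES ('2010-04-03', 42);
```

Inserting into statistics itself would store the row in the parent; routing such inserts to the right child needs a trigger or rule, as described in the partitioning chapter of the manual.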

So what makes the inheritance a cool feature - why is it cool to split the data?

Performance: When selecting data with SELECT * FROM statistics WHERE date BETWEEN x AND y, Postgres only scans the child tables whose CHECK constraints can match. E.g. SELECT * FROM statistics WHERE date BETWEEN '2010-04-01' AND '2010-04-15' only scans statistics_2010_04 (plus the parent itself, which is normally empty); the other month tables are not touched. Fast!
Index size: There is no big fat table with a big fat index on the date column. We have small tables per month with small indexes, which means faster reads.
Maintenance: We can run VACUUM FULL, REINDEX, or CLUSTER on each month table without locking all the other data.
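The points above can be checked with something like the following sketch. It assumes the monthly tables carry CHECK constraints on date and that an index named statistics_2010_04_date_idx exists on that column; the index name is a hypothetical one.

```sql
-- 'partition' is the default setting; it lets the planner use CHECK
-- constraints to skip inheritance children that cannot match.
SET constraint_exclusion = partition;

-- The plan should mention statistics and statistics_2010_04,
-- but not statistics_2010_05.
EXPLAIN
SELECT * FROM statistics
WHERE date BETWEEN '2010-04-01' AND '2010-04-15';

-- Maintenance on a single month; the other children are not locked.
VACUUM FULL statistics_2010_04;
REINDEX TABLE statistics_2010_04;
CLUSTER statistics_2010_04 USING statistics_2010_04_date_idx;
```

Constraint exclusion only works when the WHERE clause compares the split key with constants, so a query that filters on another column still scans every child.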

For the correct use of table inheritance as a performance booster, see the PostgreSQL manual. You need to set CHECK constraints on each table to tell the database on which key your data gets split (partitioned).

I make heavy use of table inheritance, especially when it comes to storing log data grouped by month. Hint: If you store data which will never change (log data), create your indexes with CREATE INDEX ... ON ... (...) WITH (fillfactor=100); This means no space for updates will be reserved in the index, so the index is smaller on disk.
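Spelled out for one of the monthly tables, the hint might look like the sketch below. The index name is an assumption; the table and column come from the statistics example.

```sql
-- Fully packed B-tree index for a month whose rows will never change.
-- B-tree indexes default to fillfactor 90, so 100 removes the free
-- space that would otherwise be left on each leaf page.
CREATE INDEX statistics_2010_05_date_idx
    ON statistics_2010_05 (date)
    WITH (fillfactor = 100);
```

This only pays off for data that really is append-only: later inserts into a fully packed index cause page splits sooner.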

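UPDATE: For tables, the fillfactor default is already 100, from http://www.postgresql.org/docs/9.1/static/sql-createtable.html:

The fillfactor for a table is a percentage between 10 and 100. 100 (complete packing) is the default

Note that this quote is about tables. B-tree indexes default to a fillfactor of 90, so the hint above still makes a difference for indexes.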

answered Sep 26 '22 21:09

S38
