I was planning to move from MySQL to PostgreSQL because I wanted to make use of TimescaleDB.
Everything looked good until I did a check on the storage size used by PostgreSQL (v11.2) compared to MySQL (v5.6). For exactly the same number of rows (1,440,000) and content:
The MySQL and PostgreSQL numbers are like for like (i.e. including indexes and other constraints), the PostgreSQL + TimescaleDB has overhead of adding a timestamp to the table. The table concerned looks like this:
create table cell(
cell_id serial not null
,ts timestamp not null
,parent_id int references parent( parent_id )
,instance_id smallint
,v float
,a float
,t float
,s1 float
,s2 float
,s3 float
,s4 float
,s5 float
,primary key( cell_id )
);
create index ix_cell_pid on cell( parent_id );
create index ix_cell_inst on cell( instance_id );
Why does PostgreSQL occupy so much more storage than MySQL?
And is there some way of significantly reducing it near to the MySQL level?
Adding a timestamp
column should add no more than 11 MB in your case (1440000 * 8 bytes, no added padding).
Have you run VACUUM FULL
in Postgres before you measured the size, for a fair comparison? I suspect table and index bloat.
Related:
In MySQL the data type float
is the single-precision floating point type occupying 4 bytes.
In Postgres the same float
is the double-precision floating point type, occupying 8 bytes (alias: float8
or double precision
).
That should explain another 44 MB of the difference. To compare apples with apples, create the Postgres table with 4-byte real
columns (alias float4
). Note the difference to MySQL, where real
is used for 8-byte floating point numbers! Unfortunate disagreements.
The MySQL manual: https://dev.mysql.com/doc/refman/8.0/en/storage-requirements.html
The Postgres manual: https://www.postgresql.org/docs/current/datatype-numeric.html
Related:
You show two indexes. Depending on what those are for, one multicolumn index might be able to replace both in Postgres - occupying just as much disk space as one of the two it replaces in this particular case (saving ~ 50 MB with the given specs).
create index ix_cell_pid on cell( parent_id, instance_id );
Consider:
In addition to Erwin Brandstetter's answer:
There are also the usually hidden system columns to take account of (used to implement MVCC) along with other tuple information such as hint bits (used to cache row visibility).
You can use the pageinspect extension to look at this information, e.g.:
create extension pageinspect;
select * from heap_page_items(get_raw_page('cell', 0));
See here for a description of the system columns.
Your indexes, too, will contain empty space to allow further tuples to be inserted more quickly. Look for "fillfactor" in the CREATE INDEX docs.
connect to the database ( in psql, type \c your-db-name
)
checkout how many size each table cost. refer to: https://wiki.postgresql.org/wiki/Disk_Usage
SELECT nspname || '.' || relname AS "relation",
pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
AND C.relkind <> 'i'
AND nspname !~ '^pg_toast'
ORDER BY pg_total_relation_size(C.oid) DESC
LIMIT 100;
delete ...
truncte <table_name> cascade;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With