Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

list of schema with sizes (relative and absolute) in a PostgreSQL database

Tags:

postgresql

I'm looking for a query that returns a result of the form for any database (see example below supposing total space used by the database is 40GB)

schema | size | relative size ----------+------------------- foo    | 15GB |   37.5%       bar    | 20GB |     50% baz    |  5GB |   12.5% 

I've managed to concoct a list of space using entities in the database sorted by schema, which has been useful, but getting a summary per schema from this doesn't look so easy. See below.

SELECT relkind,        relname,        pg_catalog.pg_namespace.nspname,        pg_size_pretty(pg_relation_size(pg_catalog.pg_class.oid)) FROM   pg_catalog.pg_class        INNER JOIN pg_catalog.pg_namespace          ON relnamespace = pg_catalog.pg_namespace.oid ORDER  BY pg_catalog.pg_namespace.nspname,           pg_relation_size(pg_catalog.pg_class.oid) DESC; 

This gives results like

  relkind |                relname                |      nspname       | pg_size_pretty  ---------+---------------------------------------+--------------------+----------------   r       | geno                                  | btsnp              | 11 GB   i       | geno_pkey                             | btsnp              | 5838 MB   r       | anno                                  | btsnp              | 63 MB   i       | anno_fid_key                          | btsnp              | 28 MB   i       | ix_btsnp_anno_rsid                    | btsnp              | 28 MB   [...]   r       | anno                                  | btsnp_shard        | 63 MB   r       | geno4681                              | btsnp_shard        | 38 MB   r       | geno4595                              | btsnp_shard        | 38 MB   r       | geno4771                              | btsnp_shard        | 38 MB   r       | geno4775                              | btsnp_shard        | 38 MB 

It looks like using an aggregation operator like SUM may be necessary, no success with that thus far.

like image 724
Faheem Mitha Avatar asked Dec 11 '10 19:12

Faheem Mitha


People also ask

How do I find the PostgreSQL database schema?

If you want to see the entire database structure in one go, type \d * in the psql console. That gives schema of all tables in that database.

What is the psql command to show the schema of a relation?

To show the name of the current schema, use the following simple command. >> SELECT current_schema(); This shows that the current schema is “public”.

How do I find the schema size in Greenplum?

How to check the Schema size? psql> select schemaname ,round(sum(pg_total_relation_size(schemaname||'. '||tablename))/1024/1024) "Size_MB" from pg_tables where schemaname='SCHEMANAME' group by 1; Replace SCHEMANAME with your schema name.


2 Answers

Try this:

SELECT schema_name,         sum(table_size),        (sum(table_size) / database_size) * 100 FROM (   SELECT pg_catalog.pg_namespace.nspname as schema_name,          pg_relation_size(pg_catalog.pg_class.oid) as table_size,          sum(pg_relation_size(pg_catalog.pg_class.oid)) over () as database_size   FROM   pg_catalog.pg_class      JOIN pg_catalog.pg_namespace ON relnamespace = pg_catalog.pg_namespace.oid ) t GROUP BY schema_name, database_size


Edit: just noticed the workaround with summing up all tables to get the database size is not necessary:

 SELECT schema_name,         pg_size_pretty(sum(table_size)::bigint),        (sum(table_size) / pg_database_size(current_database())) * 100 FROM (   SELECT pg_catalog.pg_namespace.nspname as schema_name,          pg_relation_size(pg_catalog.pg_class.oid) as table_size   FROM   pg_catalog.pg_class      JOIN pg_catalog.pg_namespace ON relnamespace = pg_catalog.pg_namespace.oid ) t GROUP BY schema_name ORDER BY schema_name 
like image 167
a_horse_with_no_name Avatar answered Oct 09 '22 22:10

a_horse_with_no_name


Better solution:

WITH   schemas AS ( SELECT schemaname as name, sum(pg_relation_size(quote_ident(schemaname) || '.' || quote_ident(tablename)))::bigint as size FROM pg_tables GROUP BY schemaname ),  db AS ( SELECT pg_database_size(current_database()) AS size )  SELECT schemas.name,         pg_size_pretty(schemas.size) as absolute_size,        schemas.size::float / (SELECT size FROM db)  * 100 as relative_size FROM schemas; 

The accepted answer solves the described problem, but the suggested query is not efficient. You can do EXPLAIN to see the difference:

EXPLAIN WITH   schemas AS ( SELECT schemaname as name, sum(pg_relation_size(quote_ident(schemaname) || '.' || quote_ident(tablename)))::bigint as size FROM pg_tables GROUP BY schemaname ),  db AS (SELECT pg_database_size(current_database()) AS size)  SELECT schemas.name,         pg_size_pretty(schemas.size) as absolute_size,        schemas.size::float / (SELECT size FROM db)  * 100 as relative_size FROM schemas;                                                   QUERY PLAN ------------------------------------------------------------------------------------------------------------  CTE Scan on schemas  (cost=47100.79..47634.34 rows=16417 width=104)    CTE schemas      ->  Finalize HashAggregate  (cost=46854.50..47100.76 rows=16417 width=72)            Group Key: n.nspname            ->  Gather  (cost=43119.63..46608.25 rows=32834 width=96)                  Workers Planned: 2                  ->  Partial HashAggregate  (cost=42119.63..42324.85 rows=16417 width=96)                        Group Key: n.nspname                        ->  Hash Left Join  (cost=744.38..39763.93 rows=94228 width=128)                              Hash Cond: (c.relnamespace = n.oid)                              ->  Parallel Seq Scan on pg_class c  (cost=0.00..38772.14 rows=94228 width=72)                                    Filter: (relkind = ANY ('{r,p}'::"char"[]))                              ->  Hash  (cost=539.17..539.17 rows=16417 width=68)                                    ->  Seq Scan on pg_namespace n  (cost=0.00..539.17 rows=16417 width=68)    CTE db      ->  Result  (cost=0.00..0.01 rows=1 width=8)    InitPlan 3 (returns $3)      ->  CTE Scan on db  (cost=0.00..0.02 rows=1 width=8) 

vs

EXPLAIN SELECT schema_name,         pg_size_pretty(sum(table_size)::bigint),        (sum(table_size) / pg_database_size(current_database())) * 100 FROM (   SELECT pg_catalog.pg_namespace.nspname as schema_name,          pg_relation_size(pg_catalog.pg_class.oid) as table_size   FROM   pg_catalog.pg_class      JOIN pg_catalog.pg_namespace ON relnamespace = pg_catalog.pg_namespace.oid ) t GROUP BY schema_name ORDER BY schema_name;                                          QUERY PLAN -------------------------------------------------------------------------------------------  GroupAggregate  (cost=283636.24..334759.75 rows=1202906 width=128)    Group Key: pg_namespace.nspname    ->  Sort  (cost=283636.24..286643.51 rows=1202906 width=72)          Sort Key: pg_namespace.nspname          ->  Hash Join  (cost=744.38..51446.15 rows=1202906 width=72)                Hash Cond: (pg_class.relnamespace = pg_namespace.oid)                ->  Seq Scan on pg_class  (cost=0.00..44536.06 rows=1202906 width=8)                ->  Hash  (cost=539.17..539.17 rows=16417 width=68)                      ->  Seq Scan on pg_namespace  (cost=0.00..539.17 rows=16417 width=68) 
like image 21
k-sever Avatar answered Oct 09 '22 23:10

k-sever