My task is to compare the schemas of several databases in our app's Postgres cluster against the same databases from a different version of the app.
The comparison should only check the schema and not data.
The most basic way would be for me to use subprocess to execute
pg_dump -U <user> -s <database> > schema.txt
for each database, then run a diff.
Two questions:
1) Is this the right approach to see if the schema has changed?
2) Is this possible through psycopg2, without using subprocess, pg_dump, or psql?
Thanks!
pg_dump is a utility for backing up a PostgreSQL database. It makes consistent backups even if the database is being used concurrently. pg_dump does not block other users accessing the database (readers or writers). pg_dump only dumps a single database.
One caveat: pg_dump dumps only a single database, so it does not include cluster-wide objects such as roles and tablespaces. To back up an entire PostgreSQL cluster, pg_dumpall is the better choice: it dumps every database along with roles, tablespaces, and permissions.
pg_dump creates a logical backup, that is a series of SQL statements that, when executed, create a new database that is logically like the original one. pg_basebackup creates a physical backup, that is a copy of the files that constitute the database cluster. You have to use recovery to make such a backup consistent.
pg_dump and pg_restore are native PostgreSQL client utilities that ship with the database installation. pg_dump produces a set of SQL statements that you can run to reproduce the original database object definitions and table data.
It depends on what is meant by "the schema has changed". If that literally means any difference at all between the schemas, no matter how insignificant, then yes, you would want to dump the schemas and then compare them.
For this task, you would definitely want to use pg_dump with the --schema-only option. There isn't an SQL statement that does this, so doing this directly via psycopg2 isn't possible. (There are lower-level Postgres library functions that allow programs like PgAdmin3 to show all the DDL for tables and such, but I believe those would need to be called directly via libpq.)
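As a rough sketch of the dump-and-diff approach from Python: the helper names (dump_schema, diff_schemas), the database names, and the user below are made up for illustration, and it assumes pg_dump is on the PATH and that authentication is handled outside the script (for example via ~/.pgpass).

```python
import difflib
import subprocess


def dump_schema(database, user):
    """Return the schema-only dump of one database as text.

    Assumes pg_dump is on the PATH and can authenticate without a prompt.
    """
    cmd = ["pg_dump", "-U", user, "--schema-only", database]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def diff_schemas(old_sql, new_sql, old_name="old", new_name="new"):
    """Return a unified diff (as a list of lines) between two schema dumps."""
    return list(
        difflib.unified_diff(
            old_sql.splitlines(),
            new_sql.splitlines(),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Hypothetical database names and user; replace with your own.
    old = dump_schema("app_v1", "postgres")
    new = dump_schema("app_v2", "postgres")
    changes = diff_schemas(old, new, "app_v1", "app_v2")
    print("\n".join(changes) if changes else "schemas are identical")
```

An empty diff means the two dumps are textually identical. Keep in mind that dumps taken with different pg_dump versions may differ in comments or formatting even when the schema itself is the same.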
However, if what you want to find out is some other type of difference, it may be possible (albeit somewhat more involved) to do it via psycopg2 by querying the various system catalog tables.
For example, you could query pg_tables to determine what tables are present, and compare those to see if any are missing. You could then query those tables directly to get more information, say, to check to see if the counts are the same, etc.
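Something along these lines would do the pg_tables comparison with psycopg2. This is only a sketch: fetch_tables and compare_tables are names I made up, the connection strings are placeholders, and it only detects missing or extra tables, not changes to columns, indexes, or constraints.

```python
def fetch_tables(dsn):
    """Return the set of (schema, table) pairs for non-system tables."""
    import psycopg2  # imported here so compare_tables can be used without it

    query = """
        SELECT schemaname, tablename
        FROM pg_tables
        WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
    """
    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(query)
            return set(cur.fetchall())
    finally:
        conn.close()


def compare_tables(old_tables, new_tables):
    """Return (missing, added): tables only in old, and tables only in new."""
    missing = sorted(set(old_tables) - set(new_tables))
    added = sorted(set(new_tables) - set(old_tables))
    return missing, added


if __name__ == "__main__":
    # Placeholder connection strings; replace with your own.
    old = fetch_tables("dbname=app_v1 user=postgres")
    new = fetch_tables("dbname=app_v2 user=postgres")
    missing, added = compare_tables(old, new)
    print("missing in new version:", missing)
    print("added in new version:", added)
```

The same pattern should extend to other catalogs or to information_schema.columns if you also care about column names and types.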