Is it possible to plot the tables in a PostgreSQL database and their relationships using R, as shown below?
If there are foreign keys between the tables, you can work out the relationships from them. To do this, run \d on a table in psql and look at the foreign keys associated with its columns.
Yes, it is possible. See the steps below for how.
Steps
Step 1
There are various ways to connect to a PostgreSQL database from R; RPostgreSQL is one of them.
An example of Step 1 in RPostgreSQL is below:
library(RPostgreSQL)
## loads the PostgreSQL driver
drv <- dbDriver("PostgreSQL")
## Open a connection
con <- dbConnect(drv, dbname="databasename")
Step 2
Getting the tables, their columns and their constraints can be done in several ways: directly in SQL, with RPostgreSQL's dbListTables and dbListFields, or with a combination of the two.
For example SQL that queries all tables in a database, all fields / columns in a table, or all constraints on a table, see the related StackOverflow answers.
In summary, you query information_schema.tables, information_schema.columns and information_schema.table_constraints for the information you need. If speed is an issue, you can use the PostgreSQL-specific tables instead of the ANSI SQL standard ones (they are mentioned in the linked answers above), but they may change over time.
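The steps here are, in three parts: get the list of tables, get the columns of each table, and get the constraints on each table.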
An example of Step 2 in RPostgreSQL is below:
Adjust your SQL to suit.
Part 1
Getting the list of tables
Using built-in function
tables1 <- dbListTables(con)
Using SQL
tables2 <- dbGetQuery(con, "select table_name from information_schema.tables")
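Part 2
Getting the list of columns in each table
Using built-in function
You would call dbListFields(con,"TableName") once per table, with lapply over the character vector returned by dbListTables, or with apply over the data frame of tables returned by the SQL query. See the StackOverflow questions "how to apply a function to every row of a matrix (or a data frame) in R" and "Apply a function to each row in a data frame in R and save the result to a variable".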
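The built-in approach could be sketched roughly as follows. The names collect_columns, list_fields and fake_schema are hypothetical and introduced only for this illustration; the stand-in schema is made up so the loop can be tried without a database, and the commented lines show how it would presumably be wired to a real connection.

```r
# Build a data frame with one row per (table, column) pair.
# `list_fields` is any function that returns the column names of a table;
# with a live connection it would wrap dbListFields().
collect_columns <- function(tables, list_fields) {
  per_table <- lapply(tables, function(tbl) {
    fields <- list_fields(tbl)
    data.frame(table_name = rep(tbl, length(fields)),
               column_name = fields,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, per_table)
}

# Stand-in for dbListFields(), using made-up tables for illustration only.
fake_schema <- list(authors = c("id", "name"),
                    books = c("id", "title", "author_id"))
columns1 <- collect_columns(names(fake_schema),
                            function(tbl) fake_schema[[tbl]])

# With a real connection (assumes `con` from Step 1):
# columns1 <- collect_columns(dbListTables(con),
#                             function(tbl) dbListFields(con, tbl))
```

The result has the same two columns as the SQL variant of this part, so later steps can use either.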
Using SQL
columns2 <- dbGetQuery(con, "select table_name,column_name from information_schema.columns")
Part 3
Getting the list of constraints
Using SQL
constraints <- dbGetQuery(con, "select table_name,constraint_name, constraint_type from information_schema.table_constraints")
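Note that information_schema.table_constraints tells you which constraints exist, but not which table a foreign key points to. One possible way to get that, sketched here, is to join it with information_schema.key_column_usage and information_schema.constraint_column_usage. The function name get_foreign_keys and the column aliases (from_table, to_table and so on) are choices made for this example, and the join assumes constraint names are unique within a schema.

```r
# SQL that lists, for every foreign key, the referencing and referenced
# table and column. The aliases are arbitrary names chosen for this sketch.
fk_sql <- "
  select tc.table_name   as from_table,
         kcu.column_name as from_column,
         ccu.table_name  as to_table,
         ccu.column_name as to_column
  from information_schema.table_constraints tc
  join information_schema.key_column_usage kcu
    on kcu.constraint_name = tc.constraint_name
   and kcu.constraint_schema = tc.constraint_schema
  join information_schema.constraint_column_usage ccu
    on ccu.constraint_name = tc.constraint_name
   and ccu.constraint_schema = tc.constraint_schema
  where tc.constraint_type = 'FOREIGN KEY'"

# Run the query on an open connection (assumes `con` from Step 1 and
# that RPostgreSQL is loaded, so dbGetQuery is available).
get_foreign_keys <- function(con) {
  dbGetQuery(con, fk_sql)
}

# foreign_keys <- get_foreign_keys(con)
```

For foreign keys that span several columns this simple join returns every combination of referencing and referenced column, so adjust the SQL to suit if your schema has those.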
Step 3
From Step 2, you should have a list of tables, a list of tables with their associated fields / columns, and a list of tables with their associated constraints.
You then need to put these into whatever form your drawing tool expects: a csv file for CityPlot, a dot file for GraphViz, igraph's graph format, or a data frame or hash that you process with your own functions, which draw the tables and the connections between them using grid or diagram.
If you are combining them into a single data frame, subset and merge will be useful.
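As one possible sketch of this step, the code below merges column and foreign key information into a single data frame and writes a dot file for GraphViz. The two input data frames are made up for illustration, and their layout (including the to_table column and the to_dot helper) is an assumption of this example rather than something the queries above return as-is.

```r
# Made-up metadata in roughly the shape Step 2 would give (illustration only).
columns <- data.frame(
  table_name  = c("authors", "authors", "books", "books", "books"),
  column_name = c("id", "name", "id", "title", "author_id"),
  stringsAsFactors = FALSE)
foreign_keys <- data.frame(
  table_name  = "books",
  column_name = "author_id",
  to_table    = "authors",
  stringsAsFactors = FALSE)

# Left join on table_name and column_name: every column is kept, and
# to_table is NA unless the column is a foreign key.
schema <- merge(columns, foreign_keys, all.x = TRUE)

# One record-shaped node per table, one edge per foreign key.
to_dot <- function(schema) {
  nodes <- vapply(split(schema$column_name, schema$table_name),
                  function(cols) paste(cols, collapse = "\\l"),
                  character(1))
  node_lines <- sprintf('  "%s" [shape=record, label="{%s|%s\\l}"];',
                        names(nodes), names(nodes), nodes)
  fk <- subset(schema, !is.na(to_table))
  edge_lines <- sprintf('  "%s" -> "%s" [label="%s"];',
                        fk$table_name, fk$to_table, fk$column_name)
  c("digraph schema {", node_lines, edge_lines, "}")
}

dot <- to_dot(schema)
writeLines(dot, file.path(tempdir(), "schema.dot"))
```

The resulting file could then be rendered outside R with GraphViz, for example with dot -Tpng schema.dot -o schema.png.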
Step 4
This step can also be done in many different ways. These include, but are not limited to, the tools mentioned in Step 3: CityPlot, GraphViz, igraph, and the diagram, shape or grid packages.
If using the diagram, shape or grid packages, you would iterate over the list of tables (or the hash or other data structure) and apply a draw function to each table, then apply a separate function to each constraint to draw the lines between tables.
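A minimal sketch of that idea with the grid package is shown below. The tables and links data frames are made up, the box positions are hard-coded rather than computed by any layout algorithm, and draw_table, draw_link and draw_schema are hypothetical helper names, so treat this as a starting point only.

```r
library(grid)

# Made-up layout: one row per table, with a centre position in npc units.
tables <- data.frame(
  name = c("authors", "books"),
  x = c(0.25, 0.75),
  y = c(0.5, 0.5),
  stringsAsFactors = FALSE)
# Made-up foreign key: books references authors.
links <- data.frame(from = "books", to = "authors", stringsAsFactors = FALSE)

# Draw one table as a labelled box.
draw_table <- function(name, x, y, width = 0.2, height = 0.15) {
  grid.rect(x = x, y = y, width = width, height = height)
  grid.text(name, x = x, y = y)
}

# Draw one constraint as an arrow between the facing edges of two boxes.
draw_link <- function(from, to, tables, width = 0.2) {
  a <- tables[tables$name == from, ]
  b <- tables[tables$name == to, ]
  side <- sign(b$x - a$x) * width / 2
  grid.segments(x0 = a$x + side, y0 = a$y, x1 = b$x - side, y1 = b$y,
                arrow = arrow(length = unit(0.02, "npc")))
}

# Iterate over tables first, then over constraints.
draw_schema <- function(tables, links) {
  grid.newpage()
  for (i in seq_len(nrow(tables))) {
    draw_table(tables$name[i], tables$x[i], tables$y[i])
  }
  for (i in seq_len(nrow(links))) {
    draw_link(links$from[i], links$to[i], tables)
  }
  invisible(nrow(tables) + nrow(links))
}

drawn <- draw_schema(tables, links)
```

For a real schema you would replace the hard-coded positions with something computed from the number of tables, and probably list the columns inside each box.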