Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Database schema question

I am designing a data model for a local city page, more like requirements for it.

So 4 tables: Country, State, City, neighbourhood.

Real life relationships is: Country owns multiple State which owns multiple cities which ows multiple neighbourhoods.

In the data model: Do we link these with FK the same way or link each with each? Like in each table there will even be a Country ID, State ID, CityID and NeighbourhoodID so each connected with each? Other wise to reach neighbourhood from country we need to join 2 other tables in between?

There are more tables I need to maintain for IP addess of cities, latitude/longitude, etc.

like image 757
Thomas Avatar asked Jun 16 '10 06:06

Thomas


People also ask

What is an example of a database schema?

The first is an RDBMS, or a relational database management system. Examples of RDBMSs include SQL Server, MySQL, and PostgreSQL. The second and most important tool is SQL (structured query language). This is the language we use to define the tables, columns, indexes, and other artifacts in our database.

What are the 3 types of database schema?

Schema is of three types: Logical Schema, Physical Schema and view Schema. Logical Schema – It describes the database designed at logical level. Physical Schema – It describes the database designed at physical level. View Schema – It defines the design of the database at the view level.

What are the 2 main types of database schemas?

There are two main kinds of database schema: A logical database schema conveys the logical constraints that apply to the stored data. It may define integrity constraints, views, and tables. A physical database schema lays out how data is stored physically on a storage system in terms of files and indices.


1 Answers

The closest thing there is to an industry standard is this: each dependent table is linked by a foreign key to its immediate parent:

create table country
(country_id number not null
 , country_name varchar2(30)
 , constraint country_pk primary key (country_id)
 )
/
create table state
(state_id number not null 
 , state_name varchar2(30)
 , country_id number not null
 , constraint state_pk primary key (state_id)
 , constraint state_country_fk foreign key (country_id)
        references country(country_id)
 )
/
create table city
(city_id number not null 
 , city_name varchar2(30)
 , state_id number not null 
 , constraint city_pk primary key (city_id)
 , constraint city_state_fk foreign key (state_id)
        references state(state_id)
 )
/
create table neighbourhood
(neighbourhood_id number not null 
 , neighbourhood_name varchar2(30)
 , city_id number not null 
 , constraint neighbourhood_pk primary key (neighbourhood_id)
 , constraint neighbourhood_city_fk foreign key (city_id)
        references city(city_id)
 )
/

An alternative approach, which has largely fallen out of favour, is to define the primary keys of the child tables as compound keys including the keys of the immediate parent table:

create table state
(country_id number not null
 , state_id number not null 
 , state_name varchar2(30)
 , constraint state_pk primary key (country_id, state_id)
 , constraint state_country_fk foreign key (country_id)
        references country(country_id)
 )
/
create table city
(country_id number not null
 , state_id number not null 
 , city_id number not null 
 , city_name varchar2(30)
 , constraint city_pk primary key (country_id, state_id, city_id)
 , constraint city_state_fk foreign key (country_id, state_id)
        references state(country_id, state_id)
 )
/
create table neighbourhood
(country_id number not null
 , state_id number not null 
 , city_id number not null 
 , neighbourhood_id number not null 
 , neighbourhood_name varchar2(30)
 , constraint neighbourhood_pk primary key (country_id, state_id, city_id, neighbourhood_id)
 , constraint neighbourhood_city_fk foreign key (country_id, state_id, city_id)
        references city(country_id, state_id, city_id)
 )
/

This approach is deprecated because in the short term it creates exceedingly unwieldy joins and in the long term it creates horrible messes when the keys change. Primary keys are not supposed to change, but compounding them creates meaning. Consequently, when the system's data changes - say there's a state boundary re-organisation - changes to a whole bunch of cities have to be cascaded to the Neighbourhood table, and any other children. Yuck.

Your proposal is an alternate version of this:

create table state
(state_id number not null 
 , state_name varchar2(30)
 , country_id number not null
 , constraint state_pk primary key (state_id)
 , constraint state_country_fk foreign key (country_id)
        references country(country_id)
 )
/
create table city
(city_id number not null 
 , city_name varchar2(30)
 , country_id number not null
 , state_id number not null 
 , constraint city_pk primary key (city_id)
 , constraint city_country_fk foreign key (country_id)
        references country(country_id)
 , constraint city_state_fk foreign key (state_id)
        references state(state_id)
 )
/
create table neighbourhood
(neighbourhood_id number not null 
 , neighbourhood_name varchar2(30)
 , country_id number not null
 , state_id number not null 
 , city_id number not null 
 , constraint neighbourhood_pk primary key (neighbourhood_id)
 , constraint neighbourhood_country_fk foreign key (country_id)
        references country(country_id)
 , constraint neighbourhood_state_fk foreign key (state_id)
        references state(state_id)
 , constraint neighbourhood_city_fk foreign key (city_id)
        references city(city_id)
 )
/

It avoids the compound keys but you still have that cascading data problem. Also it violates relational practice by introducing foreign keys for relationships which don't exist (there is no direct relationship between Neighbourhood and Country, it is implied through the intermediate linkages).

On the plus side, as you point out, this can be very helpful for running queries which want to return Neighbourhoods for a given Country. I have worked on one system where this was useful (actually it employed inherited compound keys, but the principle is the same). However, this was a very specialist data warehouse and even then the queries I ran were admin/developer queries rather than application ones. Unless you are dealing with huge amounts of data (millions of neighbourhoods) I think the performance gain from skipping a couple of joins would not be worth the overhead of managing those additional columns.

In short, use the first approach: it's neat and it's standard.

edit

"State should be optional though since not all countries have a state. Then a Country will connect with city directly."

If true, that changes everything. Obviously STATE cannot be used as an identifying foreign key for CITY. So CITY must reference COUNTRY instead. STATE can be an optional lookup on CITY.

Although I think most countries do have some equivalent sub-division such as counties or departments. Even the microstates like Lichtenstein and San Marino have municipalities (Monaco has just the one). Perhaps the only nation which doesn't is Vatican City. So consider carefully whether to structure your data model to support one or two edge cases, or to munge the data by injecting an artificial "State" for exceptions such as the Holy See. Neither approach is entirely satisfactory.

"all these fields will be auto-complete fields so not sure if that alters the table structure in anyway?"

Makes no difference.

"But who knows, few months later we may discover some cool feature that may need country to match with neighbourhoods too."

Yes but then again you may not. XP has a powerful principle called YAGNI - You're aren't gonna need it. Basically, don't do a lot of work and complicate your design for the sake of some putative future requirement which might never arrive.

And if it does arrive then the first solution would be to join NEIGHBOURHOOD and COUNTRY through the intermediate tables (or table, if you don't use STATE as the referenced for CITY). Only if the performance of that query is Teh Suck! and it it stubbornly resists tuning should you consider tweaking the data model.

like image 103
APC Avatar answered Sep 30 '22 14:09

APC