I want to design a web application for keeping track of the finances of the members of an organization. Certain functions are very similar to Splittr. I define my requirements using the MWE database diagrams:
- "Finance" tables: Each user will have one personal finance account, for which I am using the following three red tables:
- "SharedExpense" tables: Each user can have shared expenses with other users in many 'shared-expense-groups'. For each group, I am using the following three blue tables:
(Note how each user can define the amount of their share and their own category for the shared expense. The UserShare table uses a composite primary key.)
The problem: I have to relate the users to their 3 personal "Finance" tables and the 3N "SharedExpense" tables (where N is the number of 'shared-expense-groups' the user belongs to).
Attempted Solutions:
Multiple databases. Each user has a unique database for their "Finance" tables. Each 'shared-expense-group' has a unique database on the server. I can then relate the users from one master database with the following four purple tables:
Drawbacks: Foreign keys come from different databases, large number of databases to be backed up.
Multiple tables. I can create all the tables in the same database and relate all of them with the four green master tables:
Here, the number of tables is a potential problem. If there are M users and N 'shared-expense-groups', then there will be 3M + 3N tables!
The question: Is there a more elegant and simpler database design for this purpose? If not, which of the above two solutions is better and why?
Links to relevant, previous StackOverflow Q&A:
- Personal finance app database design
- Database design for tracking progress over time
- SQL for a Household Bill splitting db
- Comparing 1 Database with Many Tables to Multiple Databases with Fewer Tables in Each
There is too much here to describe all the challenges in a summary, but I'll pick out a few.
- Fundamental design violations: such as a table/database for each user
- entity design, 3NF: such as category.budget and ledger.transaction_type
- referential integrity/relationship design:
- account is for one user, but account table does not contain the user id;
- usershare is a subset of ledger, but they both point to a user;
- object naming concerns:
- clear and consistent naming of entities, based on real usage. Is a member a user or a user a member? If they are the same, choose one name. If they are not the same, the design is different. Do staff use client or customer rather than member?
- consistency in your key naming. The key name should directly tie it to the source entity. Members.ID should be referenced as members_id, rather than user_id. However, see the next entry before correcting this.
- be consistent in your entity plurality. The general consensus is that the name should describe a single record (User) rather than all the records (Users).
- ledger.spent_on - that name is not obviously a date. It could be pointing to a user or category as well. An attribute name should describe the attribute without needing additional explanation. For example, ledger.Purchase_Date is self-explanatory. It should also be clear how it relates to the entity. UserShare.Share doesn't really tell me what it contains.
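To illustrate the first point (rows per user rather than tables or databases per user), here is a minimal sketch using Python's built-in sqlite3 module. Every table and column name in it is my own assumption, not taken from your diagrams, and it is only one possible shape for the design. The point it demonstrates is that adding members, groups, and shared expenses adds rows while the number of tables stays fixed.

```python
import sqlite3

# Hypothetical schema: all names are illustrative assumptions.
SCHEMA = """
CREATE TABLE member (
    member_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE account (
    account_id INTEGER PRIMARY KEY,
    -- the account row carries its owner; UNIQUE gives one account per member
    member_id  INTEGER NOT NULL UNIQUE REFERENCES member(member_id),
    name       TEXT NOT NULL
);
CREATE TABLE expense_group (
    expense_group_id INTEGER PRIMARY KEY,
    name             TEXT NOT NULL
);
CREATE TABLE group_membership (
    expense_group_id INTEGER NOT NULL REFERENCES expense_group(expense_group_id),
    member_id        INTEGER NOT NULL REFERENCES member(member_id),
    PRIMARY KEY (expense_group_id, member_id)
);
CREATE TABLE shared_expense (
    shared_expense_id INTEGER PRIMARY KEY,
    expense_group_id  INTEGER NOT NULL REFERENCES expense_group(expense_group_id),
    purchase_date     TEXT NOT NULL,
    total_cents       INTEGER NOT NULL
);
CREATE TABLE member_share (
    shared_expense_id INTEGER NOT NULL REFERENCES shared_expense(shared_expense_id),
    member_id         INTEGER NOT NULL REFERENCES member(member_id),
    share_cents       INTEGER NOT NULL,
    PRIMARY KEY (shared_expense_id, member_id)  -- composite key, one share per member per expense
);
"""

def build_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs unless asked
    conn.executescript(SCHEMA)
    return conn

def table_count(conn: sqlite3.Connection) -> int:
    row = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table'"
    ).fetchone()
    return row[0]

conn = build_db()
tables_before = table_count(conn)

# Adding members, a group, and a shared expense only inserts rows.
conn.executemany("INSERT INTO member (member_id, name) VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cy")])
conn.executemany("INSERT INTO account (member_id, name) VALUES (?, ?)",
                 [(1, "Ann personal"), (2, "Bob personal"), (3, "Cy personal")])
conn.execute("INSERT INTO expense_group (expense_group_id, name) VALUES (1, 'Flat 4B')")
conn.executemany("INSERT INTO group_membership (expense_group_id, member_id) VALUES (?, ?)",
                 [(1, 1), (1, 2)])
conn.execute(
    "INSERT INTO shared_expense (shared_expense_id, expense_group_id, purchase_date, total_cents) "
    "VALUES (1, 1, '2024-01-05', 9000)"
)
conn.executemany(
    "INSERT INTO member_share (shared_expense_id, member_id, share_cents) VALUES (?, ?, ?)",
    [(1, 1, 6000), (1, 2, 3000)],
)
conn.commit()

tables_after = table_count(conn)
print(tables_before, tables_after)  # same number before and after the inserts
```

Amounts are stored as integer cents here only to sidestep floating-point rounding in the sketch; a production database would more likely use a decimal type.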
Sorry to be blunt, but I would start over. Consider what you have as a good trial run and start again using the additional information you have.
- Ask questions of your designs (Are all users members? Are all members users?). If the answer is anything other than Yes or No, break it down further.
- Try what-if scenarios (What if a shared ledger exceeds the category budget? How will previous spending be perceived if the category budget changes?)
- Consider what reporting questions may be asked (Who went over budget? How much are we spending on this category?) and then consider the query to answer the question.
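As an example of working backwards from a reporting question, here is a rough sketch of "Who went over budget?" as a query, again using Python's sqlite3. The schema is hypothetical: in particular, I am assuming the budget lives in its own `budget` table keyed by member, category, and month (rather than on the category itself), so that changing next month's budget does not rewrite how last month's spending is judged. Your real design may reasonably differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical tables; all names are illustrative assumptions.
CREATE TABLE member   (member_id   INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE budget (
    member_id    INTEGER NOT NULL REFERENCES member(member_id),
    category_id  INTEGER NOT NULL REFERENCES category(category_id),
    period       TEXT    NOT NULL,          -- month, formatted 'YYYY-MM'
    amount_cents INTEGER NOT NULL,
    PRIMARY KEY (member_id, category_id, period)
);
CREATE TABLE ledger (
    ledger_id     INTEGER PRIMARY KEY,
    member_id     INTEGER NOT NULL REFERENCES member(member_id),
    category_id   INTEGER NOT NULL REFERENCES category(category_id),
    purchase_date TEXT    NOT NULL,         -- ISO date, 'YYYY-MM-DD'
    amount_cents  INTEGER NOT NULL
);

INSERT INTO member   VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO category VALUES (1, 'Groceries'), (2, 'Travel');
INSERT INTO budget   VALUES (1, 1, '2024-01', 20000),
                            (1, 1, '2024-02', 30000),
                            (2, 2, '2024-01', 50000);
INSERT INTO ledger (member_id, category_id, purchase_date, amount_cents) VALUES
    (1, 1, '2024-01-05', 12000),
    (1, 1, '2024-01-20',  9500),
    (1, 1, '2024-02-03', 25000),
    (2, 2, '2024-01-10', 40000);
""")

# Sum each member's spending per category and month, then keep only the
# groups whose total exceeds the budget recorded for that same month.
OVER_BUDGET_SQL = """
SELECT m.name, c.name, b.period,
       SUM(l.amount_cents) AS spent_cents,
       b.amount_cents      AS budget_cents
FROM budget b
JOIN member   m ON m.member_id   = b.member_id
JOIN category c ON c.category_id = b.category_id
JOIN ledger   l ON l.member_id   = b.member_id
               AND l.category_id = b.category_id
               AND strftime('%Y-%m', l.purchase_date) = b.period
GROUP BY b.member_id, b.category_id, b.period
HAVING SUM(l.amount_cents) > b.amount_cents
ORDER BY m.name, c.name, b.period
"""

over_budget = conn.execute(OVER_BUDGET_SQL).fetchall()
for row in over_budget:
    print(row)
```

With the sample rows above, only Ann's January groceries are flagged: 21500 cents spent against a 20000-cent budget. If a question like this needs an awkward or impossible query against your design, that is a sign the design needs another pass.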
Read up on 3NF and maybe some of the higher normalization levels as well. Whereas 3NF is close to the minimum level of normalization you should aim for, the higher levels become increasingly specialized and may or may not be appropriate for your design.
The better you understand your data AND business, the better your design will be, and the better your end product will turn out.