Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to create a data model for invoices in a large data warehouse?

I am creating a data model for customer invoices in a large data warehouse.

The following shows the fields on a typical invoice:

enter image description here


The following is the data model I worked out so far to model the invoices:

enter image description here

Conventional wisdom is that a large data warehouse should use a star schema, which means one fact table, but it seems that to model an invoice I would need two fact tables, as shown above. Would it be correct to use two fact tables?

like image 458
davidjhp Avatar asked Dec 26 '22 18:12

davidjhp


1 Answers

I recommend you avoid multiple grain fact tables where possible.

Since Invoice Fact contains Total Shipping and Total Tax, to boil this down to Invoice Detail Fact, there are two basic options that I can think of:

  1. Create Tax and Freight columns in your Invoice Detail fact and distribute amongst your items. This Kimball Tip suggests exactly that: http://www.kimballgroup.com/2001/07/01/design-tip-25-designing-dimensional-models-for-parent-child-applications/.

  2. An alternative approach which has worked well for me is to create two new members in your product dimension. One for tax and one for freight. Then add these two line items to the fact just like a normal product with appropriate values.

When you analyse by Invoice ID you get the total including Tax & Freight. When you analyse by individual product you don't get a misleading Freight or Tax figure.

like image 146
Nick.McDermaid Avatar answered Jan 10 '23 19:01

Nick.McDermaid