Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance star schema (BI)

We have different meassures per company, country and product:

Company A, Australia, Beer, 10, 22, 3, 9

Common use case is to calculate the sum of measure 1 for beer in a specific country.

I have two options in mind:

Option A:

Dimension-Table:

ID
Company
Country
Product

Measure-Table:

Dimension.ID (FK)
DateTime
Measure1
Measure2
Measure3
Measure4

Or Option B:

All-In-One-Table:

ID
Company
Country
Product
DateTime
Measure1
Measure2
Measure3
Measure4

In the dimension-table are 10,000 unique records and in the Measure-table nearly 30,000,000 records.

At the moment we use alphanumeric key for product, company and country.

My question:

Which performs better?

Any suggestions for indexing the columns?

like image 243
LuckyStrike Avatar asked May 28 '26 16:05

LuckyStrike


1 Answers

If you assume that average size of one row in "dimension" table is 50 bytes (just guessing) and in the "measure table" also 50 bytes then in Option A you would have database of size not less than 1.5 GB, and in Option B not less than 3 GB. This is a significant difference.

So I would say that Option A is better because the database size is smaller.

Indexing depends on queries you want to use. But I think that when using Option A no indexes (apart from those created by primary and foreign keys) are needed because the "dimension" table is small.

like image 158
Grzegorz Avatar answered May 31 '26 05:05

Grzegorz



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!