I was wondering about the Google Analytics database design: how do they handle such huge volumes of data on an hourly, or even per-minute, basis?
Let's say they have 100 million users, and almost every user has 300 counters updated every minute. For one user, 300 counters per minute come to 18,000 rows in one hour. For one day that is 432K rows, and about 3 million rows for one week.
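To make the arithmetic explicit, here is a quick back-of-the-envelope sketch. It assumes one row per counter per minute, and it reads the "3 million" figure as a per-user weekly total, which is my assumption. The variable names are only illustrative.

```python
# Back-of-the-envelope row counts, assuming one row per counter per minute.
COUNTERS_PER_MINUTE = 300   # counters per user per minute (from the question)
USERS = 100_000_000         # 100 million users (from the question)

rows_per_hour = COUNTERS_PER_MINUTE * 60   # one user, one hour
rows_per_day = rows_per_hour * 24          # one user, one day
rows_per_week = rows_per_day * 7           # one user, one week (assumed period)
rows_per_day_all_users = rows_per_day * USERS

print(rows_per_hour)           # 18000
print(rows_per_day)            # 432000
print(rows_per_week)           # 3024000
print(rows_per_day_all_users)  # 43200000000000
```

So under these assumptions the total is on the order of 43 trillion rows per day across all users, which is the scale I am asking about.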
I suspect they are not using a relational database, but I'm not sure about it...
Does anyone have any insight into this?
Regards,
BigTable
And you're right, they are not using a relational database.
High Scalability has a summary of Google's architecture. It doesn't discuss Analytics directly, but it shows how BigTable fits into the entire infrastructure. I'm not sure the details of Google's schema are available (as the article says, "Infrastructure can be a competitive advantage"), but I would guess it's a lot more tightly bound to the hardware implementation than a regular data model would be.
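To give a feel for the general idea, here is a toy, in-memory sketch of the publicly described BigTable data model: a sparse, sorted map indexed by row key, column, and timestamp. This is not Google's actual Analytics schema, which I don't know. The `ToySortedTable` class, the `user#minute` row-key layout, and the `counter:pageviews` column name are all hypothetical, made up purely for illustration.

```python
from bisect import bisect_left, insort


class ToySortedTable:
    """Toy stand-in for a sparse, sorted (row, column, timestamp) -> value map."""

    def __init__(self):
        self._keys = []   # sorted list of (row_key, column, timestamp) tuples
        self._cells = {}  # key tuple -> value; absent cells cost nothing

    def put(self, row_key, column, timestamp, value):
        key = (row_key, column, timestamp)
        if key not in self._cells:
            insort(self._keys, key)  # keep keys in sorted order
        self._cells[key] = value

    def scan(self, row_prefix):
        """Yield (key, value) for every row key starting with row_prefix."""
        start = bisect_left(self._keys, (row_prefix,))
        for key in self._keys[start:]:
            if not key[0].startswith(row_prefix):
                break  # sorted order: no later key can match
            yield key, self._cells[key]


# Hypothetical row-key layout: "<user>#<minute>", so one user's rows are adjacent.
table = ToySortedTable()
table.put("user42#2024-01-01T00:00", "counter:pageviews", 0, 3)
table.put("user42#2024-01-01T00:01", "counter:pageviews", 0, 5)
table.put("user7#2024-01-01T00:00", "counter:pageviews", 0, 1)

total = sum(value for _, value in table.scan("user42#"))
print(total)  # 8
```

The point of the sketch is only that keeping rows sorted by key turns "all counters for one user over a time range" into a contiguous range scan rather than a join, which is one reason this kind of store suits counter-heavy workloads.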