Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL: How to query multiple 'statistics' in a single query?

Tags:

php

mysql

This is more of a question rather than a problem that I need to solve. My backend is fast and my queries running great, so it's not that important. Okay, but let's get to it.

I have 4 panels of statistics on my dashboard regarding the number of views from today, yesterday, this week and this month; each taking up one query in my database. What I was wondering is, how would one put all those queries together to ease up the load on the database/server?

I was looking through Stackoverflow before asking and saw one saying something like:

SUM(case when status = 'open' then 1 else 0 end) as [Open],
SUM(case when status = 'closed' then 1 else 0 end) as [Closed]

Source: Gathering multiple statistics about a table in a single query

Which could be what I need, could be something like:

SUM(case when DATE(created_at) = '2015-07-23' then 1 else 0 end) as today,
SUM(case when DATE(created_at) = '2015-07-22' then 1 else 0 end) as yesterday,
SUM(case when WEEK(created_at) = '29' then 1 else 0 end) as week,
SUM(case when MONTH(created_at) = '7' then 1 else 0 end) as month

I was just wondering if anyone has some better suggestions, as I have applied this to my function and it works just fine.

like image 750
Lasse R Avatar asked Jul 23 '15 20:07

Lasse R


1 Answers

The comments are giving good hints. I'll provide here another idea, that conceptually is used in most of the banks I've been working with.

When there are billions of rows, and you do not need exact instant snapshots for every request (meaning: you have a tolerance for slightly outdated data), it is worth exploring batch processes.

This is how it works:

  1. You define your lag tolerance: for example: "I'm ok with data outdated 8 hours". This is the periodicity of your batch process.
  2. You denormalize the database to add a "redundant" column/table storing running totals of your choice. For example, you would add a table called statistics_snapshot with 4 columns: (timestamp, month, day, week) or something of the like.
  3. You create a stored procedure in mysql that fills up this table with your 4 queries, or with one global queries as you suggested. A timestamp is also registered so you know when it was taken.
  4. You create a user with EXECUTE grant for that procedure, ONLY.

    CREATE USER 'cron_mysql_user'@'localhost' IDENTIFIED BY 'strongpassword';
    GRANT EXECUTE ON PROCEDURE db_name.proc_name TO 'cron_mysql_user'@'localhost';
    
  5. You use DBMS_JOB a cron job to connect mysql and run this procedure with periodicity defined in point #1. Usually you can run scripts from the command line like this:

    mysql --user='cron_mysql_user'@'localhost' -pstrongpassword --execute="call proc_name()" db_name
    
  6. You create nice reports based on your periodic snapshots :-)

The advantage of doing so is that you centralized I/O a few times a day only, a controlled manner, so you only have a very light SELECT statement to do when you need to know the statistics.

like image 111
Sebas Avatar answered Nov 14 '22 21:11

Sebas