Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Non-Stop cronjob

Tags:

I got a new requirement from the business about "real time" data/statistics. They want to show how our system is performing in real time.

I am not sure how to do it but here are my thoughts:

I don't think it's possible to get the data every seconds because cronjob run at least every minute. So, without telling them, I said YES it's possible.

Now my problem is this how can I run a cronjob that get statistics on our sites (sales, impressions, cpc etc...)?

Example:

From 9h01 AM to 9h02 AM i have:

  • 41 views on product 1
  • 1 order
  • 8 referral click from clients
  • 2 added to wish list

From 9h02 AM to 9h03 AM i have:

  • 57 views on product 1
  • 0 order
  • 13 referral click from clients
  • 0 added to wish list

Total:

  • 98 views on product 1
  • 1 order
  • 21 referral click from clients
  • 2 added to wish list

How can i make sure I won't calculate duplicates if for some reason the database is slow and does not process information on time?

Thanks

Edit: The company has 200 employee in 3 different states which consists of sales, business analyst, tech, accounting, and executive staff and these people can read these reports.

In the last year we hired 20 employees so it will grow a little. For traffic data, it is hard to tell exactly how much data we get per minutes. The estimate is approx 2.5k to 10k per minute.

We just ordered 3 PowerEdge R510 (Intel® Xeon® E5503, 2.0Ghz, 4M Cache, 12GB Memory (3x4GB), 1333MHz Dual Rank, 4 x 300GB 15K RPM Serial-Attach SCSI 6Gbps RAID 5).

like image 727
Gino Sullivan Avatar asked Nov 29 '11 11:11

Gino Sullivan


People also ask

What is * * * * * In cron job?

What does * mean in Cron? The asterisk * is used as a wildcard in Cron. * sets the execution of a task to any minute, hour, day, weekday, or month.

How do I stop all Cronjobs?

You can stop a single cron job by removing its line from the crontab file. To do that, run the crontab -e command and then delete the line for the specific task. Alternatively, you can stop the cron job by commenting it out in the crontab file.

How do I run a cron job every 5 minutes?

basic 3. /usr/bin/vim. tiny 4. /bin/ed Choose 1-4 [1]: Make a new line at the bottom of this file and insert the following code. Of course, replace our example script with the command or script you wish to execute, but keep the */5 * * * * part as that is what tells cron to execute our job every 5 minutes.


Video Answer


1 Answers

Here's what I recommend based on your servers/employee/data if these servers are. Because you are using 1 server (and 1 backup), the capacity of your drive should be enough for a while unless you want to archive complete data on this server. Data can grow rapidly and I would think to increase the capacity or archive the data somewhere else.

Now, because you have a lot of people that can request reporting data the main idea is to retrieve data as fast as possible to make sure you don't lock records (specially if you using myisam tables - table locking vs innodb which has row level locking).

Use your index (unique if you need) wisely and store your data as efficiently as possible using timestamp.

What you can also do is to summarize your data which can simplify your queries. Although, is not a common practice in databases since it does not respect the normal forms. You can get great performance but it's a pain to maintain.

To be honest, a cron that runs every minutes is fine since you have the time when you save the record but it is possible to get data every second. I recommend to make sure when you get a record, you mark this record as "processed" or some other status in order you don't take this record twice.

Now when you summarize your data make sure you optimize your queries and you can check also what the explain will output and then make a decision.


EDIT: Summarizing data (which does not respect the database normalization) will get you great performance since you only query records without using aggregate functions and having joins tables using minimal where clause.

Example:

98 views on product 1 1 order 21 referral click from clients 2 added to wishlist 

can be:

SELECT  views, orders, referral, whishlist FROM  summarize_stats_20111201 /* daily table for example */ WHERE  `time` between 1322791200 /*2011-12-01 21:00:00*/ AND 1322791260 /*2011-12-01 21:01:00*/; 

views has the total amount of views, in this example 98

orders has the total amount of orders, in this example 1

referral has the total amount of referral, in this example 21

wishlist has the total amount of wishlist, in this example 2

These are calculated data in a summary table (this is why i said "does not respect the database normalization" because you never calculate data in a RDBMS) but if you need data instantly, this is a way you can do it.


EDIT 2: Here's a example of maintaining this solution:

You have a cronjob that maintains tables. His job is to create the table for the next day or what ever you need.

// in php $date = date('Ymd', strtotime('+1 day')); // for daily table  $sql = 'CREATE TABLE IF NOT EXISTS the_database.summarize_stats_" . $date . "; 

So when you inserts, make sure you have the right table name and you use ON DUPLICATE KEY

// in php $sql = 'INSERT INTO TABLE summarize_stats_20111201 SET /* all the fields you need */ ON DUPLICATE KEY views = views + 1; 

for example if you want to increase the view

What I also forget is if you need to query 1 week of data, you will have to create a merge table. This way you can do something like:

SELECT  views, orders, referral, whishlist FROM  summarize_stats_2011 /* yearly merge table for example */ WHERE  `time` between 1322272800 /*2011-11-25 21:00:00*/ AND 1322791260 /*2011-12-01 21:01:00*/; 

This way you don't have to UNION ALL tons of queries.

like image 90
Book Of Zeus Avatar answered Oct 20 '22 20:10

Book Of Zeus