
Distributing payload to multiple cron jobs

Tags:

linux

shell

cron

I have a shell script, say data.sh. To execute this script, I pass it a single argument, say Table_1.

I have a test file, which I get as the output of a different script.

This test file contains more than 1000 arguments to pass to the script.

The file looks like this:

Table_1
Table_2
Table_3
Table_4
...and so on

Now I want to run the script in parallel for these arguments.

I am doing this with cron jobs.

First I split the test file into 20 parts using the split command in Linux:

 split -l $(($(wc -l < test )/20 + 1)) test

This divides the test file into 20 parts named xaa, xab, xac, and so on.
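
(As a side note, split also accepts an output prefix as its last argument, so the pieces can be written straight into a chosen directory. For example, with a hypothetical /home/xxxx/work directory:)

    split -l $(($(wc -l < test)/20 + 1)) test /home/xxxx/work/part_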

Then I run the cron jobs:

* * * * * while IFS=',' read a;do /home/XXXX/data.sh $a;done < /home/xxxx/xaa
* * * * * while IFS=',' read a;do /home/XXXX/data.sh $a;done < /home/xxxx/xab
and so on.

As this involves a lot of manual work, I would like to do it dynamically.

Here is what I want to achieve:

1) As soon as I get the test file, I would like it to be split automatically into, say, 20 files and stored in a particular place.

2) Then I would like to schedule the cron job for every day at 5 AM, passing the 20 files to the script.

What is the best way to implement this? Any answer with an explanation would be appreciated.

asked Jan 19 '26 by User12345

1 Answer

Here is what you could do. Create two cron jobs:

  1. file_splitter.sh -> splits the file and stores the pieces in a particular directory
  2. file_processor.sh -> picks up one file at a time from the directory above, does a read loop, and calls data.sh; removes the file after successful processing (a sketch of both scripts follows this list)
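
Here is a minimal sketch of both scripts, assuming the other script drops the test file at /home/xxxx/incoming/test and the split pieces live in /home/xxxx/work (both paths are placeholders):

    #!/bin/bash
    # file_splitter.sh -- split the incoming test file into 20 pieces
    incoming=/home/xxxx/incoming/test   # assumed drop location of the test file
    workdir=/home/xxxx/work             # assumed directory for the split pieces

    [ -s "$incoming" ] || exit 0        # nothing to do if there is no new file
    mkdir -p "$workdir"
    split -l $(($(wc -l < "$incoming") / 20 + 1)) "$incoming" "$workdir/part_"
    rm -f "$incoming"                   # consume the input so it isn't split twice

and the processor:

    #!/bin/bash
    # file_processor.sh -- read each split file line by line and feed data.sh
    workdir=/home/xxxx/work             # same assumed directory as above

    for f in "$workdir"/part_*; do
        [ -e "$f" ] || exit 0           # glob didn't match: nothing left to process
        while IFS= read -r table; do    # one table name per line, so no comma splitting needed
            /home/XXXX/data.sh "$table"
        done < "$f"
        rm -f "$f"                      # remove only after the whole file was processed
    done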

Schedule file_splitter.sh to run ahead of file_processor.sh.
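
With the 5 AM requirement from the question, the crontab could look like this (paths are placeholders):

    # run the splitter a few minutes before the processor, every day
    55 4 * * * /path/to/file_splitter.sh
    0 5 * * * /path/to/file_processor.sh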

If you want to achieve further parallelism, you can make file_splitter.sh write the split files into multiple directories with a few files in each. Let's say they are called sub1, sub2, etc. Then you can schedule multiple instances of file_processor.sh and pass the subdirectory name as an argument. Since the split files are stored in separate directories, we can ensure that only one job processes the files in a particular subdirectory.
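
As a sketch, file_splitter.sh could spread the pieces round-robin over, say, four subdirectories (the count and the sub* names are arbitrary here), and file_processor.sh would then take its working directory as $1 instead of hard-coding it:

    # in file_splitter.sh, after the split: spread the pieces over 4 subdirectories
    n=0
    for f in "$workdir"/part_*; do
        dest="$workdir/sub$(( n % 4 + 1 ))"
        mkdir -p "$dest"
        mv "$f" "$dest/"
        n=$((n + 1))
    done

    # crontab: one processor instance per subdirectory, all at 5 AM
    0 5 * * * /path/to/file_processor.sh /home/xxxx/work/sub1
    0 5 * * * /path/to/file_processor.sh /home/xxxx/work/sub2
    0 5 * * * /path/to/file_processor.sh /home/xxxx/work/sub3
    0 5 * * * /path/to/file_processor.sh /home/xxxx/work/sub4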

It's better to keep the cron command as simple as possible.

* * * * * /path/to/file_processor.sh

is better than

* * * * * while IFS=',' read a;do /home/XXXX/data.sh $a;done < /home/xxxx/xab

Makes sense?

I wrote a post about how to manage cron jobs effectively. You may want to take a look at it:

Managing log files created by cron jobs

answered Jan 22 '26 by codeforester


