I need to upgrade a Perl CGI script in which users must complete 3 steps. After they finish each step, the script logs which step the user completed. Having a record of this is important so we can prove to a user that they only finished step one and didn't complete all three steps, for example.
Right now, the script creates one log file for each instance of the CGI script. So if UserA does step 1, then UserB does steps 1, 2, and 3, and then UserA finishes steps 2 and 3, the order of the log files would be:
LogFile.UserA.Step1
LogFile.UserB.Step1
LogFile.UserB.Step2
LogFile.UserB.Step3
LogFile.UserA.Step2
LogFile.UserA.Step3
The log files are named with the current timestamp, a random number, and the process ID (PID).
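Presumably the unique name is assembled roughly like this (a hypothetical reconstruction, not the actual script):

my $logfile = sprintf "LogFile.%d.%d.%d", time, int(rand 100_000), $$;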
This works fine to prevent the same file from being written to more than once, but the directory quickly accumulates thousands of files, each containing just a few bytes. There is a process to rotate and compress these logs, but it has fallen to me to make the script log to just one file a day, to reduce the number of log files being created.
Basically, the log file will have the current date in the file name, and anytime the CGI script needs to write to the log, it will append to the one log file for that day, regardless of the user or what step they are on.
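For illustration, here is one way the per-day path could be derived (the directory and name pattern here are my assumptions, not decided yet):

use POSIX qw/ strftime /;

my $date = strftime "%Y-%m-%d", localtime;
my $log  = "/var/log/steps/steps.$date.log";   # one file per day, shared by all users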
Nothing will need to read the log file; the only thing that will happen to it is appends by the CGI script. The log rotation will run on log files that are 7 days or older.
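A rotation sweep along those lines might look like this sketch, run daily from cron (the path and the choice of gzip are assumptions):

# Compress day-files whose last modification is 7+ days old.
for my $file (glob "/var/log/steps/steps.*.log") {
    next unless -M $file >= 7;     # age in days since last modification
    system "gzip", $file;
}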
My question is: what is the best way to handle the concurrent appends to this log file? Do I need to lock it before appending? I found a page on Perl Monks that seems to indicate that "when multiple processes are writing to the same file, and all of them have the file opened for appending, data shall not be overwritten."
I've learned that just because it can be done doesn't mean that I should, but in this case, what is the safest, best practice way to do this?
Thanks!
Yes, use flock.
An example program is below, beginning with typical front matter:
#! /usr/bin/perl
use warnings;
use strict;
use Fcntl qw/ :flock /;
Then we specify the path to the log and the number of clients that will run:
my $log = "/tmp/my.log";
my $clients = 10;
To log a message, open the file in append mode so all writes automatically go at the end. Then call flock to wait our turn for exclusive access to the log. Once we're up, write the message and close the handle, which automatically releases the lock.
sub log_step {
    my ($msg) = @_;
    # Append mode: the OS positions every write at the current end of the file.
    open my $fh, ">>", $log or die "$0 [$$]: open: $!";
    # Block until we have exclusive access to the log.
    flock $fh, LOCK_EX or die "$0 [$$]: flock: $!";
    print $fh "$msg\n" or die "$0 [$$]: write: $!";
    # Closing flushes the buffer and releases the lock.
    close $fh or warn "$0 [$$]: close: $!";
}
Now fork off $clients child processes to go through all three steps with random intervals between:
my %kids;
my $id = "A";   # Perl's magic string increment yields A, B, C, ...

for (1 .. $clients) {
    my $pid = fork;
    die "$0: fork: $!" unless defined $pid;

    if ($pid) {
        # Parent: remember the child's PID so we can reap it later.
        ++$kids{$pid};
        print "$0: forked $pid\n";
    }
    else {
        # Child: walk through the three steps, pausing 0-2 seconds between.
        my $user = "User" . $id;
        log_step "$user: Step 1";
        sleep rand 3;
        log_step "$user: Step 2";
        sleep rand 3;
        log_step "$user: Step 3";
        exit 0;
    }

    ++$id;
}
Don't forget to wait on all the children to exit:
print "$0: reaping children...\n";
while (keys %kids) {
my $pid = waitpid -1, 0;
last if $pid == -1;
warn "$0: unexpected kid $pid" unless $kids{$pid};
delete $kids{$pid};
}
warn "$0: still running: ", join(", " => keys %kids), "\n"
if keys %kids;
print "$0: done!\n", `cat $log`;
Sample output:
[...]
./prog.pl: reaping children...
./prog.pl: done!
UserA: Step 1
UserB: Step 1
UserC: Step 1
UserC: Step 2
UserC: Step 3
UserD: Step 1
UserE: Step 1
UserF: Step 1
UserG: Step 1
UserH: Step 1
UserI: Step 1
UserJ: Step 1
UserD: Step 2
UserD: Step 3
UserF: Step 2
UserG: Step 2
UserH: Step 2
UserI: Step 2
UserI: Step 3
UserB: Step 2
UserA: Step 2
UserA: Step 3
UserE: Step 2
UserF: Step 3
UserG: Step 3
UserJ: Step 2
UserJ: Step 3
UserE: Step 3
UserH: Step 3
UserB: Step 3
Keep in mind that the order will be different from run to run.
"when multiple processes are writing to the same file, and all of them have the file opened for appending, data shall not be overwritten" may be true, but that doesn't mean your data can't come out mangled (one entry inside another). It's not very likely to happen for small amounts of data, but it might.
flock is a reliable and reasonably simple solution to that problem. I would advise you to simply use that.