 

Perl script slowing down as it progresses

I have written a Perl script that compiles displays so that they can be viewed by a user. There are thousands of these display files (DSET files) that need to be compiled and the process takes quite a long time (4-5 hours). The displays are compiled using an external executable file (I have no details about the inner workings of this executable).

As a solution to speed up the process we decided to run multiple instances of this executable in parallel in an attempt to drastically improve performance.

Running with 16 threads improves performance significantly: the job now takes about 1 hour to complete rather than 4-5. But there is still an issue: as the script progresses, the time the executable takes to run keeps increasing.

I did a test with about 1000 DSET files and monitored the execution time of the external compiling program as the Perl script progressed. Below is a plot of the execution time increasing over time.

[Plot: execution time per DSET vs. script progress]

As you can see, when the script starts, it takes about 4 seconds to launch the executable, compile a DSET, and close the executable again. Once the script has processed about 500 DSETs, the time it takes to compile each subsequent DSET starts to increase. By the time the script is nearing its end, some of the DSET files take as much as 12 seconds to compile!

Here is an example of the function each thread performs:

# Build the displays
sub fgbuilder {
    my ($tmp_ddldir, $outdir, $eset_files, $image_files) = @_;

    # Get environment variables
    my $executable = $ENV{fgbuilder_executable};
    my $version    = $ENV{fgbuilder_version   };

    # Create the necessary directories
    my $tmp_imagedir = "$tmp_ddldir\\images";
    my $tmp_outdir   = "$tmp_ddldir\\compiled";
    make_path($tmp_ddldir, $tmp_imagedir, $tmp_outdir);

    # Copy the necessary files
    copy($_, $tmp_ddldir  ) for @{$eset_files };
    copy($_, $tmp_imagedir) for @{$image_files};

    # Take the next DSET off of the queue
    while (my $dset_file = $QUEUE->dequeue()) {

        # Copy the DSET to the thread's ddldir
        copy($dset_file, $tmp_ddldir);

        # Get the DSET name
        my $dset          = basename($dset_file);
        my $tmp_dset_file = "$tmp_ddldir\\$dset";

        # Build the displays in the DSET
        my $start = time;
        system $executable,
            '-compile' ,
            '-dset'    , $dset        ,
            '-ddldir'  , $tmp_ddldir  ,
            '-imagedir', $tmp_imagedir,
            '-outdir'  , $tmp_outdir  ,
            '-version' , $version     ;
        my $end = time;
        my $elapsed = $end - $start;

        $SEMAPHORE->down();
        open my $fh, '>>', "$ENV{fgbuilder_errordir}\\test.csv" or die "Failed to open file: $!";
        print {$fh} "$PROGRESS,$elapsed\n";
        close $fh;
        $SEMAPHORE->up();

        # Remove the temporary DSET file
        unlink $tmp_dset_file;

        # Move all output files to the outdir
        recursive_move($tmp_outdir, $outdir);

        # Update the progress
        { lock $PROGRESS; $PROGRESS++; }
        my $percent = $PROGRESS/$QUEUE_SIZE*100;
        { local $| = 1; printf "\rBuilding displays ... %.2f%%", $percent; }
    }

    return;
}
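
(A note on the timing: the built-in time has only one-second resolution, so the per-DSET numbers in the CSV are fairly coarse. If finer measurements were wanted, the timing part of the loop could be sketched with the core Time::HiRes module, keeping the same semaphore-guarded CSV append, roughly like this:)

    use Time::HiRes qw(gettimeofday tv_interval);

    # Time the external compile with sub-second resolution
    my $start = [gettimeofday];
    system $executable,
        '-compile' ,
        '-dset'    , $dset        ,
        '-ddldir'  , $tmp_ddldir  ,
        '-imagedir', $tmp_imagedir,
        '-outdir'  , $tmp_outdir  ,
        '-version' , $version     ;
    my $elapsed = tv_interval($start);    # floating-point seconds

    # Append to the shared CSV under the semaphore
    $SEMAPHORE->down();
    open my $fh, '>>', "$ENV{fgbuilder_errordir}\\test.csv"
        or die "Failed to open timing log: $!";
    printf {$fh} "%d,%.3f\n", $PROGRESS, $elapsed;
    close $fh;
    $SEMAPHORE->up();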

Every time through the loop it spawns a new instance of the display building executable, waits for it to complete and then closes that instance (which should free up any memory it was using and resolve any issues like the one I am seeing).

There are 16 of these threads running in parallel, each taking a new DSET off of the queue, compiling it and moving the compiled display to the output directory. Once the display is compiled it then proceeds to take another DSET off of the queue and restart the process until the queue is exhausted.
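
For context, here is a minimal sketch of how the workers and the shared $QUEUE, $SEMAPHORE, $PROGRESS and $QUEUE_SIZE can be wired up (the code that gathers @dset_files, @eset_files and @image_files is omitted, and $tmp_base is a placeholder for the scratch directory):

    use threads;
    use threads::shared;
    use Thread::Queue;
    use Thread::Semaphore;

    # Shared state used by the fgbuilder worker above
    our $QUEUE     = Thread::Queue->new();
    our $SEMAPHORE = Thread::Semaphore->new(1);
    our $PROGRESS :shared = 0;
    our $QUEUE_SIZE;

    # Load the queue with every DSET file, then end it so that
    # dequeue() returns undef once the queue is drained
    # (requires Thread::Queue 3.01+ for end())
    $QUEUE->enqueue(@dset_files);
    $QUEUE_SIZE = scalar @dset_files;
    $QUEUE->end();

    # Spawn 16 workers, each with its own temporary ddldir
    my @workers = map {
        threads->create(\&fgbuilder, "$tmp_base\\thread$_", $outdir,
                        \@eset_files, \@image_files);
    } 1 .. 16;

    # Wait for all workers to finish
    $_->join() for @workers;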

I have been scratching my head for days trying to figure out why it is slowing down. During the process my RAM usage is stable and not increasing, and my CPU usage is nowhere near maxed out. Any help or insight on what may be going on here is appreciated.


EDIT

I wrote a test script to test the theory that the problem is caused by a disk I/O caching issue. In this script I kept the same basic body of the old script but replaced the call to the executable with my own task.

Here is what I replaced the executable with:

    # Convert the file to hex (multiple times so it takes longer! :D)
    my @hex_lines = ();
    open my $ascii_fh, '<', $tmp_dset_file or die "Failed to open file: $!";
    while (my $line = <$ascii_fh>) {
        my $hex_line = unpack 'H*', $line;
        $hex_line = unpack 'H*', $hex_line for 2 .. 8;   # re-encode 7 more times
        push @hex_lines, $hex_line;
    }
    close $ascii_fh;

    # Print to several output files (the real executable writes more
    # than one file to disk for every DSET it processes)
    make_path($tmp_outdir);
    for my $prefix ('', 2 .. 6) {
        open my $hex_fh, '>', "$tmp_outdir\\$prefix$dset" or die "Failed to open file: $!";
        print {$hex_fh} @hex_lines;
        close $hex_fh;
    }

Instead of calling the executable to compile the DSETs, I open each one as a text file, do some simple processing, and write a few files to disk (a few per DSET, because the executable writes more than one file to disk for every DSET it processes). I then monitored the processing time and graphed the results.

Here are my results:

[Plot: processing time vs. script progression]

I do believe that part of my issue with the other script is a disk I/O problem, but as you can see here, with the disk I/O issue I purposely created, the increase in processing time is not gradual. It has an abrupt jump and then the results become rather unpredictable.

In my previous script you can see some of that unpredictability, and it writes a lot of files, so I have no doubt the issue is caused at least in part by a disk I/O problem. But that still does not explain why the increase in processing time is gradual and appears to happen at a constant rate.

I believe there is still some other factor at play here that we aren't accounting for.

asked Apr 30 '15 by tjwrona1992


1 Answer

I think you simply have a disk fragmentation issue. Given that you have multiple threads constantly creating and deleting files of varying sizes, disk space eventually becomes very fragmented. I don't know which operating system you are running this under, but I would guess it is Windows.

The reason you can't reproduce this with your test tool is probably the behavior of your external compiler tool: it most likely creates the output file and then extends it many times, with varying delays between writes. When multiple threads do this at the same time, their files end up interleaved on disk, especially if disk usage is relatively high (say, over 70%). Your test seems to serialize each file's creation, which avoids fragmentation from concurrent writes.
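
If you want to check this with your test tool, try mimicking that write pattern: instead of dumping each output file with a single print, extend it in small chunks with short pauses in between, so that the 16 threads interleave their disk allocations. A rough sketch (the chunk size and delay are arbitrary):

    use Time::HiRes qw(usleep);

    # Extend the output file a little at a time, pausing between writes,
    # so that concurrent threads interleave their allocations on disk
    open my $out_fh, '>', "$tmp_outdir\\$dset" or die "Failed to open file: $!";
    for my $hex_line (@hex_lines) {
        print {$out_fh} $hex_line;
        usleep 10_000;    # 10 ms between extensions (arbitrary)
    }
    close $out_fh;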

Possible solutions:

  1. Defragment the disk drive. Simply copying the compiled files to another partition/disk, removing them, and copying them back should be enough.
  2. Run your external compiler against a few different independent partitions to avoid fragmentation (see the sketch after this list).
  3. Make sure your filesystem has 50% or more free space.
  4. Use an operating system that is less prone to filesystem fragmentation, e.g. Linux.
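
For option 2, the only change on the Perl side is where each worker's scratch directories point. Assuming the workers are created with something like threads->create(\&fgbuilder, $scratch_dir, ...), a sketch (the drive letters are placeholders):

    # Spread the workers' scratch space across independent partitions
    my @scratch_roots = ('D:\\fgbuild', 'E:\\fgbuild', 'F:\\fgbuild', 'G:\\fgbuild');

    my @workers = map {
        my $root = $scratch_roots[($_ - 1) % @scratch_roots];
        threads->create(\&fgbuilder, "$root\\thread$_", $outdir,
                        \@eset_files, \@image_files);
    } 1 .. 16;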

answered Sep 30 '22 by mvp