
Is File::Slurp a faster way to write out a file in Perl?

I have a Perl script where I am writing out a very large log file. Currently I write the file out in the 'traditional' Perl way:

open FILE, ">", 'log.txt';
print FILE $line;
.....
close FILE;

I've heard a lot of good things about File::Slurp when reading in files, and how it can vastly improve runtimes. My question is: would using File::Slurp make writing out my log file any faster? I ask because writing out a file in Perl seems pretty simple as it is, and I don't know how File::Slurp could really optimize it any further.

asked Sep 03 '12 by srchulo


People also ask

How do I slurp a file in Perl?

There are several ways in Perl to read an entire file into a string (a procedure also known as "slurping"). If you have access to CPAN, you can use the File::Slurp module:

use File::Slurp;
my $file_content = read_file('text_document.txt');

How do I read an array from a file in Perl?

Use readline in list context. After opening the file, read from the $fh filehandle into an array variable: my @rows = <$fh>;. Perl will then read the content of the whole file in one step, and each row of the file becomes one element of the array.
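
Put together, that idiom might look like this (the file name here is just a placeholder):

# open the file and read every line into an array, one element per line
open my $fh, '<', 'text_document.txt' or die "Cannot open file: $!";
my @rows = <$fh>;
close $fh;

chomp @rows;                            # optionally strip the trailing newlines
print "Read ", scalar @rows, " rows\n";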


1 Answer

The File::Slurp utilities may, under certain circumstances, be fractionally faster overall than the equivalent streamed implementation, but file I/O is so very much slower than anything based solely on memory and CPU speed that it is almost always the limiting resource.

I have never heard any claims that File::Slurp can vastly improve runtimes, and would appreciate seeing a reference to that effect. The only way I could see it being a more efficient solution is if the program requires random access to the file or has to read it multiple times. Because the data is all in memory at once there is no overhead to accessing any of it, but in that case my preference would be for Tie::File, which makes it appear as if the data is all available simultaneously, with little speed impact and far less memory overhead.
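
As a sketch of that Tie::File approach (the file name and line index are arbitrary examples), the lines of a file can be treated as an ordinary Perl array without slurping the whole thing into memory:

use Tie::File;

# tie the array to the file: each element is one line, fetched on demand
tie my @lines, 'Tie::File', 'log.txt' or die "Cannot tie log.txt: $!";

print 'The file has ', scalar @lines, " lines\n";
print "Line 500 is: $lines[499]\n";   # random access without reading the rest

untie @lines;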

In fact it may well be that a call to read_file makes the process seem much slower to the user. If the file is significantly large then the time taken to read all of it and split it into lines may amount to a distinct delay before processing can start, whereas opening a file and reading the first line will usually appear to be instantaneous.
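
A rough illustration of the difference (the file name is arbitrary, and this assumes File::Slurp is installed):

use File::Slurp;

# streamed: the first line is available almost immediately
open my $fh, '<', 'big_log.txt' or die $!;
my $first_line = <$fh>;

# slurped: nothing can be processed until the whole file has been read and split into lines
my @all_lines   = read_file('big_log.txt');
my $first_again = $all_lines[0];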

The same applies at the end of the program. A call to write_file, which combines the data into disk blocks and pages it out to the file, will take substantially longer than simply closing the file.
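
For comparison, a sketch of the slurped version of your log writer (with placeholder data): it holds everything in memory and writes it in a single burst at the end.

use File::Slurp;

# every line is accumulated in memory first...
my @lines = map { "line $_ of the log\n" } 1 .. 1_000_000;

# ...and nothing reaches the disk until this one call at the very end
write_file('log.txt', @lines);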

In general the traditional streaming output method is preferable. It has little or no speed impact, and it avoids data loss by saving the data incrementally instead of waiting until a vast swathe of data has been accumulated in memory before discovering that it cannot be written to disk for one reason or another.

My advice is that you reserve File::Slurp for small files where random access could significantly simplify the program code. Even then there is nothing wrong with

my @data = do {
  open my $fh, '<', 'my_file' or die $!;
  <$fh>;
};

for input, or

open my $fh, '>', 'out_file' or die $!;
print { $fh } $_ for @data;

for output. Particularly in your case, where you are dealing with a very large log file, I think there is no question that you should stick to streamed output methods.

answered Sep 28 '22 by Borodin