
What is the fastest way to 'print' to file in perl?

Tags: perl, buffer

I've been writing output from perl scripts to files for some time using code as below:

open( OUTPUT, ">:utf8", $output_file ) or die "Can't write new file: $!";

print OUTPUT "First line I want printed\n";
print OUTPUT "Another line I want printing\n";

close(OUTPUT);

This works, and is faster than my initial approach, which used "say" instead of print (thank you, NYTProf, for enlightening me to that!).

However, my current script loops over hundreds of thousands of lines and takes many hours to run using this method, and NYTProf is pointing the finger at my thousands of 'print' commands. So, the question is... Is there a faster way of doing this?

Other Info that's possibly relevant... Perl Version: 5.14.2 (On Ubuntu)

Background of the script in question... A number of '|' delimited flat files are being read into hashes; each file has some sort of primary key matching entries from one to another. I'm manipulating this data and then combining it into one file for import into another system.

The output file is around 3 Million lines, and the program starts to noticeably slow down after writing around 30,000 lines to said file. (A little reading around seemed to point towards running out of write buffer in other languages but I couldn't find anything about this with regard to perl?)

EDIT: I've now tried adding the line below, just after the open() statement, to disable print buffering, but the program still slows around the 30,000th line.

OUTPUT->autoflush(1);
Ashimema asked Mar 10 '12 20:03



2 Answers

I think you need to redesign the algorithm your program uses. File output speed isn't influenced by the amount of data that has been output, and it is far more likely that your program is reading and processing data but not releasing it.

  • Check the amount of memory used by your process to see if it increases inexorably

  • Beware of for (<$filehandle>) loops, which read whole files into memory at once

  • As I said in my comment, disable the relevant print statements to see how performance changes
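A minimal sketch of the second point, assuming a small hypothetical in-memory input so that it runs on its own (the data and variable names are made up for illustration). A `for` loop evaluates `<$in>` in list context and builds the whole list of lines before the first iteration, whereas a `while` loop reads one line per iteration:

```perl
use strict;
use warnings;

# Hypothetical '|'-delimited input, held in memory so the sketch is self-contained.
my $data = join '', map { "$_|value$_\n" } 1 .. 5;

# List context: for (<$in>) reads every remaining line into a temporary list
# before the first iteration runs.
open( my $in, '<', \$data ) or die "Can't open in-memory file: $!";
my $for_count = 0;
for my $line (<$in>) {
    $for_count++;
}
close($in);

# Scalar context: while reads one line per iteration, so only the current
# line is held by the loop itself.
open( $in, '<', \$data ) or die "Can't open in-memory file: $!";
my $while_count = 0;
while ( my $line = <$in> ) {
    chomp $line;
    my ( $key, $value ) = split /\|/, $line;
    $while_count++;
}
close($in);

print "for: $for_count lines, while: $while_count lines\n";
```

Both loops visit the same lines; the difference is only in how much is held in memory at once. For the first bullet, watching the process in a tool such as top while the script runs is usually enough to see whether memory keeps growing.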

Borodin answered Oct 31 '22 07:10


Have you tried concatenating all the individual prints into a single scalar and then printing that scalar all at once? I have a script that outputs an average of 20 lines of text for each input line. With individual print statements, it took a long time even when sending the output to /dev/null. But when I packed all the output (for a single input line) together, using things like:

$output .= "...";

$output .= sprintf("%s...", $var);

Then, just before leaving the line-processing subroutine, I 'print $output', printing all the lines at once. The number of calls to print went from ~7.7M to about 386K, equal to the number of lines in the input data file. This shaved about 10% off my total execution time.
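Put together, the approach might look roughly like the sketch below. The records, field names, and the in-memory output handle are assumptions made so the example is runnable on its own; a real script would open its output file instead and build the lines from its own data.

```perl
use strict;
use warnings;

# Hypothetical records; the real script would build these from its input files.
my @records = map { +{ id => $_, name => "item$_" } } 1 .. 3;

# Write to an in-memory buffer so the sketch is self-contained; a real script
# would open the output file here instead.
my $written = '';
open( my $out, '>', \$written ) or die "Can't open output: $!";

my $print_calls = 0;
for my $rec (@records) {
    # Accumulate every line produced for this record in one scalar ...
    my $output = '';
    $output .= "id|$rec->{id}\n";
    $output .= sprintf( "name|%s\n", $rec->{name} );

    # ... then call print once per record instead of once per line.
    print {$out} $output;
    $print_calls++;
}
close($out) or die "Can't close output: $!";

print "print called $print_calls times\n";
```

How much this helps will depend on the script; the 10% figure above is from my own case and may not carry over.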

JimB answered Oct 31 '22 09:10