I have a fairly large CSV file, with 4 million records of 375 fields each, that needs to be processed. I'm using the Ruby CSV library to read this file and it is very slow. I thought PHP CSV file processing was slow, but comparing the two reads, PHP is more than 100 times faster. I'm not sure if I'm doing something dumb or if this is just the reality of Ruby not being optimized for this type of batch processing. I set up simple test programs to get comparative times in both Ruby and PHP. All I do is read: no writing, no building of big arrays, and I break out of the CSV read loops after processing 50,000 records. Has anyone else experienced this performance issue?
I'm running locally on a Mac with 4 GB of memory, running OS X 10.6.8 and Ruby 1.8.7.
The Ruby process takes 497 seconds simply to read 50,000 records; the PHP process runs in 4 seconds. That is not a typo, it's more than 100 times faster. FYI, I had code in the loops to print out data values to make sure that each of the processes was actually reading the file and bringing data back.
This is the Ruby Code:
require 'csv'

# pathfile must hold the path to the CSV file; it is set elsewhere
x = 0
t1 = Time.now
CSV.foreach(pathfile) do |row|
  x += 1
  break if x > 50000
end
t2 = Time.now
puts " Time to read the file was #{t2 - t1} seconds"
Here is the PHP code:
// $pathfile must hold the path to the CSV file; it is set elsewhere
$t1 = time();
$fpiData = fopen($pathfile, 'r') or die("cannot open input file ");
$seqno = 0;
while ($inrec = fgetcsv($fpiData, 0, ',', '"')) {
    if ($seqno > 50000) break;
    $seqno++;
}
fclose($fpiData) or die("cannot close input data file");
$t2 = time();
$t3 = $t2 - $t1;
echo "Start time is $t1 - end time is $t2 - Time to Process was " . $t3 . "\n";
You'll likely get a massive speed boost by simply updating to a current version of Ruby. In version 1.9, FasterCSV was integrated as Ruby's standard CSV library, replacing the much slower implementation that ships with 1.8.
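To compare timings before and after an upgrade, the read loop from the question can be wrapped in a small benchmark. This is only a sketch for Ruby 1.9 or later: the helper name `count_rows`, the temporary sample file, and the row limit of 500 are made up for illustration, and you would point it at your real file and limit instead.

```ruby
require 'csv'
require 'benchmark'
require 'tempfile'

# Hypothetical helper: read at most `limit` rows from the CSV file at `path`
# and return how many rows were read.
def count_rows(path, limit)
  rows = 0
  CSV.foreach(path) do |_row|
    rows += 1
    break if rows >= limit
  end
  rows
end

# Build a small throwaway CSV so the sketch runs on its own.
# Array#to_csv quotes the embedded comma and appends the newline.
sample = Tempfile.new(['sample', '.csv'])
1_000.times { |i| sample.write([i, "name#{i}", "value,#{i}"].to_csv) }
sample.close

rows_read = nil
elapsed = Benchmark.realtime { rows_read = count_rows(sample.path, 500) }
puts "Read #{rows_read} rows in #{elapsed} seconds"
```

Benchmark.realtime returns the elapsed wall-clock time as a float, so it also gives sub-second resolution that a whole-second timer would hide on short runs.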
Check out chruby to manage your different Ruby versions.
Check out the smarter_csv gem, which has special options for handling huge files by reading the data in chunks.
It also returns the CSV rows as hashes, which can make it easier to insert or update the data in a database.
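The idea of reading in chunks and getting hashes back can be sketched with nothing but the standard library on Ruby 1.9 or later. Note that this is not smarter_csv's API, just a stand-in to show the pattern; the helper name `each_chunk`, the sample columns, and the chunk size of 2 are assumptions for illustration.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: yield the rows of the CSV file at `path` in batches of
# `chunk_size`, each row converted to a Hash keyed by the header line.
def each_chunk(path, chunk_size)
  # Without a block, CSV.foreach returns an enumerator, so only one
  # batch of rows is held in memory at a time.
  CSV.foreach(path, headers: true).each_slice(chunk_size) do |rows|
    yield rows.map(&:to_h)
  end
end

# Build a small throwaway CSV with a header line so the sketch runs on its own.
sample = Tempfile.new(['sample', '.csv'])
sample.write("id,name\n")
5.times { |i| sample.write([i, "name#{i}"].to_csv) }
sample.close

chunks = []
each_chunk(sample.path, 2) { |chunk| chunks << chunk }
puts "Got #{chunks.size} chunks; first row: #{chunks.first.first.inspect}"
```

In real use, the block would hand each batch of hashes to a bulk insert or update rather than collecting them in an array, which would defeat the purpose for a 4-million-row file.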