Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Perl - Code Enhancement

Tags:

perl

I've just started coding in Perl and am just looking to find out if the code below can be made more efficient or can be done in fewer lines.

I've researched a bit into the Win32::OLE module and the Text::CSV module, but this seemed the way to go from what I read so far.

This question is basically a newbie asking an elder: "Hey, how do I become a better Perl progammer?"

The purpose of the code is to obtain data from specified ranges in specified sheets of an Excel workbook and write the contents of those ranges to CSV files.

Also, I know I need to implement general checks, like making sure the my $cellValue is defined before adding it to the array and such, but I am looking more for overall structure. Like is there a way to flatten the looping by putting all the whole row into an array at once, or the whole range in an array or reference, or something of that nature?

Thanks

use strict;
use warnings;
use Spreadsheet::XLSX;

my $excel = Spreadsheet::XLSX -> new ('C:\scott.xlsm',);
my @sheets = qw(Fund_Data GL_Data);

foreach my $sheet (@sheets) {

    my $worksheet = $excel->Worksheet($sheet);
    my $cell = $worksheet->get_cell(25,0);

    if ($cell) { # make sure cell value isn't blank
        my $myFile = "C:/$sheet.csv";
        open NEWFILE, ">$myFile" or die $!;

        # write all cells from Range("A25:[MaxColumn][MaxRow]") to a csv file
        my $maxCol = $worksheet->{MaxCol};
        my $maxRow = $worksheet->{MaxRow};
        my @arrRows;
        my $rowString;

        # loop through each row and column in defined range and string together each row and write to file
        foreach my $row (24 .. $maxRow) {

            foreach my $col (0 .. $maxCol) {

                my $cellValue = $worksheet->{Cells} [$row] [$col]->Value();

                if ($rowString) {
                    $rowString = $rowString . "," . $cellValue;
                } else {
                    $rowString = $cellValue;
                }
            }

            print NEWFILE "$rowString\n";
            undef $rowString;
        }
    }
}
like image 732
Scott Holtzman Avatar asked May 24 '12 20:05

Scott Holtzman


1 Answers

Mark's suggestion is an excellent one. Another minor improvement would be to replace "Do a bunch of nested logic if $cell, with "don't do anything unless $cell - this way you have slightly more readable code (remove 1 extra indentation/nested block; AND don't have to worry what happens if $cell is empty.

# OLD
foreach my $sheet (@sheets) {
    my $worksheet = $excel->Worksheet($sheet);
    my $cell = $worksheet->get_cell(25,0);

    if ($cell) { # make sure cell value isn't blank
        # All your logic in the if
    }
}

# NEW
foreach my $sheet (@sheets) {
    my $worksheet = $excel->Worksheet($sheet);
    next unless $worksheet->get_cell(25,0); # You don't use $cell, so dropped

    # All your logic that used to be in the if
}

As you noted, Text::CSV would be a good thing to consider, depending on whether your data ever needs to be quoted based on CSV standard (e.g. contains spaces, commas, quotes etc...). If it may need to be quoted, don't re-invent the wheel, and use Text::CSV for printing instead. Untested example would be something like this:

# At the start of the script:
use Text::CSV;
my $csv = Text::CSV->new ( { } ); # Add error handler!

    # In the loop, when the file handle $fh is opened
    foreach my $row (24 .. $maxRow) {
        my $cols = [ map { $worksheet->{Cells}[$row][$_] } 0 .. $maxCol) ];
        my $status = $csv->print ($fh, $cols);
        # Error handling
    }
like image 71
DVK Avatar answered Oct 15 '22 19:10

DVK