Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to delete entire column in Excel sheet and write updated data in new excel file using Perl?

Tags:

perl

I am new to Perl. I have excel file say "sample.xls" which looks like follows. Sample.xls

There are about data of 1000 rows like this. I want to parse this file and write it in another file say "output.xls" with following output format.

output.xls I have written a script in perl, however, it doesn't give me the exact output the way I want. Also, looks like the script is not very efficient. Can anyone guide me how I can improve my script as well as have my output as shown in "output.xls" ??

Here's the Script:

#!/usr/bin/perl –w

use strict;
use warnings;
use Spreadsheet::ParseExcel;
use Spreadsheet::WriteExcel;
use Spreadsheet::WriteExcel::Chart;


# Read the input and output filenames.
my $inputfile  = "path/sample.xls";
my $outputfile = "path/output.xls";

if ( !$inputfile || !$outputfile ) {
    die( "Couldn't find file\n" );
}

my $parser      = Spreadsheet::ParseExcel->new();
my $inwb = $parser->parse( $inputfile );

if ( !defined $inwb ) {
    die "Parsing error: ", $parser->error(), ".\n";
}

my $outwb = Spreadsheet::WriteExcel->new( $outputfile );


my $inws  = $inwb->worksheet( "Sheet1" );
my $outws = $outwb->add_worksheet("Sheet1");
my $out_row       = 0;

my ( $row_min, $row_max ) = $inws->row_range();
my ( $col_min, $col_max ) = $inws->col_range();

my $format = $outwb->add_format(
center_across => 1,
bold => 1,
size => 10,
border => 4,
color => 'black',
border_color => 'black',
align => 'vcenter',
);


$outws->write(0,0, "Item Name", $format);
$outws->write(0,1, "Spec", $format);
$outws->write(0,2, "First name", $format);
$outws->write(0,3, "Middle Name", $format);
$outws->write(0,4, "Last Name", $format);
$outws->write(0,5, "Customer Number", $format);
$outws->write(0,6, "Age", $format);
$outws->write(0,7, "Units", $format);

my $col_count = 1;
#$row_min = 1;
for my $inws ( $inwb->worksheets() ) {
    my ( $row_min, $row_max ) = $inws->row_range();
    my ( $col_min, $col_max ) = $inws->col_range();

    for my $in_row ( 2 .. $row_max ) {

        for my $col (  0 .. 0 ) {


            my $cell = $inws->get_cell( $in_row, $col);

            my @fields = split /_/, $cell->value();
                next unless $cell;


            $outws->write($in_row,$col, $cell->value());
            $outws->write($in_row,$col+1, $fields[1]);
        }
    }   

    for my $in_row ( 2 .. $row_max ) {

        for my $col (  1 .. 1 ) {

            my $cell = $inws->get_cell( $in_row, $col);

            my @fields = split /_/, $cell->value();
                next unless $cell;


            #$outws->write($in_row,$col+1, $cell->value());
            $outws->write($in_row,$col+1, $fields[0]);
            $outws->write($in_row,$col+2, $fields[1]);
            $outws->write($in_row,$col+3, $fields[2]);
            $outws->write($in_row,$col+4, $fields[3]);
        }
    }   

    for my $in_row ( 2 .. $row_max ) {

        for my $col (  2 .. 2 ) {

            my $cell = $inws->get_cell( $in_row, $col);

            my @fields = split /_/, $cell->value();
                next unless $cell;


            $outws->write($in_row,6, $cell->value());
        }
    }   

    for my $in_row ( 2 .. $row_max ) {

        for my $col (  3 .. 9 ) {

            my $cell = $inws->get_cell( $in_row, $col);

            next unless $cell;


        }
    }   

    for my $in_row ( 2 .. $row_max ) {
        for my $col ( 10 .. 10 ) {

            my $cell = $inws->get_cell( $in_row, $col );

            next unless $cell;


            $outws->write($in_row,7, $cell->value());

        }
    }

}
like image 285
Joe the techie Avatar asked Jun 04 '16 02:06

Joe the techie


1 Answers

To get your output sorted, you need to collect all the information first before you are writing it out. Right now, you are doing a bit of jumping back and forth between rows and columns.

Here are some changes I would make to get it sorted, and make it more efficient (to read).

  • Create a data structure $data outside of your loop to store all the information.
  • If there is only one worksheet, you don't need to loop over sheets. Just work with one sheet.
  • Loop over the lines.
  • Inside that loop, use the code you have to parse the individual fields to just parse them. No 2..2 loops. Just a bunch of statements.

    my @item_fields = split /_/, $inws->get_cell( $in_row, 0 ) || q{};
    my @name_fields = split /_/, $inws->get_cell( $in_row, $col ) || q{};
    
  • Store them in $data per item.

    push @{ $data } = [ $item_fields[0], ... ];
    
  • Done with the loop. Open the output file.

  • Loop over $data with a sort and write to the output file.

    foreach my $row (sort { $a->[0] cmp $b->[0] } @{ $data } ) { ... }
    
  • Done.

I suggest you read up on sort and also check out perlref and perlreftut to learn more about references (data structures).

like image 104
simbabque Avatar answered Sep 28 '22 15:09

simbabque