I'm relatively new to Perl and I've come across this project that I'm having a bit of a hard time with. The object of the project is to compare two csv files, one of which would contain: $name, $model, $version and the other which would contain: $name2,$disk,$storage in the end the RESULT file will contain that matched lines and put together the information like so: $name, $model, $version, $disk,$storage.
I've managed to do this, but my problem is that when one of the elements in missing the program breaks. When it encounters a line in the file missing an element it stops at that line. How can I fix this problem? any suggestions or a way as to how I can perhaps make it skip that line and continue on?
Here's my code:
open( TESTING, '>testing.csv' ); # Names will be printed to this during testing. only .net ending names should appear
open( MISSING, '>Missing.csv' ); # Lines with missing name feilds will appear here.
#open (FILE,'C:\Users\hp-laptop\Desktop\file.txt');
#my (@array) =<FILE>;
my @hostname; #stores names
#close FILE;
#***** TESTING TO SEE IF ANY OF THE LISTED ITEMS BEGIN WITH A COMMA AND DO NOT HAVE A NAME.
#***** THESE OBJECTS ARE PLACED INTO THE MISSING ARRAY AND THEN PRINTED OUT IN A SEPERATE
#***** FILE.
#open (FILE,'C:\Users\hp-laptop\Desktop\file.txt');
#test
if ( open( FILE, "file.txt" ) ) {
}
else {
die " Cannot open file 1!\n:$!";
}
$count = 0;
$x = 0;
while (<FILE>) {
( $name, $model, $version ) = split(","); #parsing
#print $name;
chomp( $name, $model, $version );
if ( ( $name =~ /^\s*$/ )
&& ( $model =~ /^\s*$/ )
&& ( $version =~ /^\s*$/ ) ) #if all of the fields are blank ( just a blank space)
{
#do nothing at all
}
elsif ( $name =~ /^\s*$/ ) { #if name is a blank
$name =~ s/^\s*/missing/g;
print MISSING "$name,$model,$version\n";
#$hostname[$count]=$name;
#$count++;
}
elsif ( $model =~ /^\s*$/ ) { #if model is blank
$model =~ s/^\s*/missing/g;
print MISSING"$name,$model,$version\n";
}
elsif ( $version =~ /^\s*$/ ) { #if version is blank
$version =~ s/^\s*/missing/g;
print MISSING "$name,$model,$version\n";
}
# Searches for .net to appear in field "$name" if match, it places it into hostname array.
if ( $name =~ /.net/ ) {
$hostname[$count] = $name;
$count++;
}
#searches for a comma in the name feild, puts that into an array and prints the line into the missing file.
#probably won't have to use this, as I've found a better method to test all of the feilds ( $name,$model,$version)
#and put those into the missing file. Hopefully it works.
#foreach $line (@array)
#{
#if($line =~ /^\,+/)
#{
#$line =~s/^\,*/missing,/g;
#$missing[$x]=$line;
#$x++;
#}
#}
}
close FILE;
for my $hostname (@hostname) {
print TESTING $hostname . "\n";
}
#for my $missing(@missing)
#{
# print MISSING $missing;
#}
if ( open( FILE2, "file2.txt" ) ) { #Run this if the open succeeds
#open outfile and print starting header
open( RESULT, '>resultfile.csv' );
print RESULT ("name,Model,version,Disk, storage\n");
}
else {
die " Cannot open file 2!\n:$!";
}
$count = 0;
while ( $hostname[$count] ne "" ) {
while (<FILE>) {
( $name, $model, $version ) = split(","); #parsing
#print $name,"\n";
if ( $name eq $hostname[$count] ) # I think this is the problem area.
{
print $name, "\n", $hostname[$count], "\n";
#print RESULT"$name,$model,$version,";
#open (FILE2,'C:\Users\hp-laptop\Desktop\file2.txt');
#test
if ( open( FILE2, "file2.txt" ) ) {
}
else {
die " Cannot open file 2!\n:$!";
}
while (<FILE2>) {
chomp;
( $name2, $Dcount, $vname ) = split(","); #parsing
if ( $name eq $name2 ) {
chomp($version);
print RESULT"$name,$model,$version,$Dcount,$vname\n";
}
}
}
$count++;
}
#open (FILE,'C:\Users\hp-laptop\Desktop\file.txt');
#test
if ( open( FILE, "file.txt" ) ) {
}
else {
die " Cannot open file 1!\n:$!";
}
}
close FILE;
close RESULT;
close FILE2;
I think you want next, which lets you finish the current iteration immediately and start the next one:
while (<FILE>) {
( $name, $model, $version ) = split(",");
next unless( $name && $model && $version );
...;
}
The condition that you use depends on what values you'll accept. In my examples, I'm assuming that all values need to true. If they need to just not be the empty string, maybe you check the length instead:
while (<FILE>) {
( $name, $model, $version ) = split(",");
next unless( length($name) && length($model) && length($version) );
...;
}
If you know how to validate each field, you might have subroutines for those:
while (<FILE>) {
( $name, $model, $version ) = split(",");
next unless( length($name) && is_valid_model($model) && length($version) );
...;
}
sub is_valid_model { ... }
Now you just need to decide how to integrate that into what you are already doing.
You should start by adding use strict
and use warnings
to the top of your program, and declaring all variables with my
at their point of first use. That will reveal a lot of simple mistakes that are otherwise difficult to spot.
You should also use the three-parameter for of open
and lexical filehandles, and the Perl idiom for checking exceptions on opening files is to add or die
to an open
call. if
statements with an empty block for the success path waste space and become unreadable. An open
call should look like this
open my $fh, '>', 'myfile' or die "Unable to open file: $!";
Finally, it is much safer to use a Perl module when you are handling CSV files as there are a lot of pitfalls in using a simple split /,/
. The Text::CSV
module has done all the work for you and is available on CPAN.
You problem is that, having read to the end of the first file, you don't rewind or reopen it before reading from the same handle again in the second nested loop. That means no more data will be read from that file and the program will behave as if it is empty.
It is a bad strategy to read through the same file hundreds of times just to pair up coresponding records. If file is of a reasonable size you should build a data structure in memory to hold the information. A Perl hash is ideal as it allows you to look up the data corresponding to a given name instantly.
I have written a revision of your code that demonstrates these points. It would be awkward for me to test the code as I have no sample data, but if you continue to have problems please let us know.
use strict;
use warnings;
use Text::CSV;
my $csv = Text::CSV->new;
my %data;
# Read the name, model and version from the first file. Write any records
# that don't have the full three fields to the "MISSING" file
#
open my $f1, '<', 'file.txt' or die qq(Cannot open file 1: $!);
open my $missing, '>', 'Missing.csv'
or die qq(Unable to open "MISSING" file for output: $!);
# Lines with missing name fields will appear here.
while ( my $line = csv->getline($f1) ) {
my $name = $line->[0];
if (grep $_, @$line < 3) {
$csv->print($missing, $line);
}
else {
$data{$name} = $line if $name =~ /\.net$/i;
}
}
close $missing;
# Put a list of .net names found into the testing file
#
open my $testing, '>', 'testing.csv'
or die qq(Unable to open "TESTING" file for output: $!);
# Names will be printed to this during testing. Only ".net" ending names should appear
print $testing "$_\n" for sort keys %data;
close $testing;
# Read the name, disk and storage from the second file and check that the line
# contains all three fields. Remove the name field from the start and append
# to the data record with the matching name if it exists.
#
open my $f2, '<', 'file2.txt' or die qq(Cannot open file 2: $!);
while ( my $line = $csv->getline($f2) ) {
next unless grep $_, @$line >= 3;
my $name = shift @$line;
next unless $name =~ /\.net$/i;
my $record = $data{$name};
push @$record, @$line if $record;
}
# Print the completed hash. Send each record to the result output if it
# has the required five fields
#
open my $result, '>', 'resultfile.csv' or die qq(Cannot open results file: $!);
$csv->print($result, qw( name Model version Disk storage ));
for my $name (sort keys %data) {
my $line = $data{$name};
if (grep $_, @$line >= 5) {
$csv->print($result, $data{$name});
}
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With