The hash I tested contains around 70,000 colleges, and each college contains around 20 students. I ran each version 5 times; the results are below. There is a considerable difference in performance between the foreach loop and the while (each) loop. Why is that?
Code with while loop:
while ( my ($college_code, $college_info_hr) = each (%{$college_data_hr}) ) {
    while ( my ($student_num, $student_info_hr) = each (%{$college_info_hr->{'students'}}) ) {
        if ($student_num < 104000) { ## Delete the info of students before 2004.
            delete $college_info_hr->{'students'}{$student_num};
        }
    }
}
Code with foreach loop:
foreach my $college_code (keys %{$college_data_hr}) {
    foreach my $student_num (keys %{$college_data_hr->{$college_code}{'students'}}) {
        if ($student_num < 104000) { ## Delete the info of students before 2004.
            delete $college_data_hr->{$college_code}{'students'}{$student_num};
        }
    }
}
When the number of colleges is 70,000, the execution times are:
For the code with the while loop (interval time is in seconds):
Interval time: 2.186621
Interval time: 2.058644
Interval time: 2.055645
Interval time: 2.101637
Interval time: 2.124632
For the code with the foreach loop (interval time is in seconds):
Interval time: 1.341768
Interval time: 1.436751
Interval time: 1.346529
Interval time: 1.302775
Interval time: 1.356765
When the number of colleges is 248,000, the execution times are:
(execution times for while loop)
Interval time: 9.084427
Interval time: 8.438684
Interval time: 9.329338
Interval time: 9.169687
(execution times for foreach loop)
Interval time: 5.502048
Interval time: 6.386692
Interval time: 5.596032
Interval time: 5.620144
The foreach version only dereferences the $college_data_hr->{$college_code}{'students'} hashref once per college to build its key list, so it is faster than the while version, which has to dereference $college_info_hr->{'students'} again on every call to each, that is, once per student.
The foreach version will likely use more memory, though, because it has to build a temporary list containing the keys of each hash.
Data::Alias might help you speed up the while solution. I've not benchmarked this, but it should be fairly fast...
use Data::Alias;
while ( my ($college_code, $college_info_hr) = each %$college_data_hr ) {
    alias ( my %students = %{$college_info_hr->{'students'}} );
    while ( my ($student_num, $student_info_hr) = each %students ) {
        if ($student_num < 104000) { ## Delete the info of students before 2004.
            delete $students{$student_num};
        }
    }
}
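If you would rather not pull in an extra module, the same idea (look up the inner hash once per college rather than once per student) can be sketched in plain Perl by keeping the students hashref in a lexical variable. This is an unbenchmarked sketch, not a measured result: the helper name prune_old_students, its $cutoff parameter, and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): deletes every student
# whose number is below $cutoff and returns how many entries were removed.
sub prune_old_students {
    my ($college_data_hr, $cutoff) = @_;
    my $deleted = 0;
    while ( my ($college_code, $college_info_hr) = each %$college_data_hr ) {
        # Fetch the inner hashref once per college; skip colleges without one.
        my $students_hr = $college_info_hr->{'students'} or next;
        while ( my ($student_num, $student_info_hr) = each %$students_hr ) {
            if ($student_num < $cutoff) {
                # Deleting the key most recently returned by each() is safe.
                delete $students_hr->{$student_num};
                $deleted++;
            }
        }
    }
    return $deleted;
}

# Tiny made-up data set, just to show the call.
my $college_data_hr = {
    C001 => { students => { 103500 => {}, 104200 => {} } },
    C002 => { students => { 103999 => {}, 105000 => {} } },
};
my $removed = prune_old_students($college_data_hr, 104000);
print "Removed $removed student records\n";
```

This keeps the low memory use of each, since no temporary key list is built. Whether it actually closes the gap with foreach would need to be timed on the real data.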