Extracting file names

Tags: regex, perl
I'm writing a script that takes a list of files from a directory, opens each one, and then searches for the lines that contain a filename with the .zip extension. Then I want to strip out just the filename from the line. Here is my code:

foreach (@fnames) {
    chomp ($_);
    open FILE, '<', "$_";
    @archives = grep { /.+?\.zip/ } <FILE>;

    foreach (@archives) {
        if ($_ =~ /("|>)(.+?)("|<)/) { push @files, $2; }
    }
}

The files I'm pulling the data from will contain the .zip filenames between either double quotes or angle brackets. This code is returning nothing, but I know the filenames are there. If I do a grep in the terminal I can see all of them, but the grep in Perl isn't giving me anything. Any ideas?

stimko68 asked Nov 25 '25 14:11


1 Answer

Possible things wrong:

  • @fnames is empty, because of some error in code you are not showing.
  • open FILE, ... fails, but since you did not check the return value of open, it fails silently and you never find out. Use open ... or die $!
  • You have uppercase letters in your input, e.g. ZIP, and do not use the /i ignore case option in the grep. Btw, .+? in the beginning is fairly useless, unless you expect unwanted strings that begin with .zip (i.e. it only checks that there is at least one character before).
  • The if-statement inside the second loop will only grab the first match on each line. Since that pattern does not mention .zip at all, the first match may not even be a filename.
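
The last two points are easy to check in isolation. Here is a minimal sketch, assuming a made-up sample line (the contents of $line are hypothetical, since the asker's real input is not shown). It compares the if-statement from the question with a match using /g in list context and the /i option:

```perl
use strict;
use warnings;

# Hypothetical sample line; the asker's real input is not shown.
my $line = 'see "first.zip" and <b>SECOND.ZIP</b>';

# The if-statement from the question stops after the first match.
my @first_only;
if ($line =~ /("|>)(.+?)("|<)/) { push @first_only, $2; }

# /g in list context returns every capture, and /i also accepts .ZIP.
# [^"<>]* keeps each capture inside a single pair of delimiters.
my @all = $line =~ /[">]([^"<>]*\.zip)["<]/ig;

print "if-statement: @first_only\n";   # if-statement: first.zip
print "list context: @all\n";          # list context: first.zip SECOND.ZIP
```

On this sample the if-statement finds one name, while the list-context match finds both, including the upper-case one.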

Also:

  • You should use a lexical filehandle with open.
  • You should use strict and warnings, if you are not already doing so.
  • my @archives and my @files in the proper lexical scope will help ensure you get and keep the data you want.
  • $_ =~ /.../ can simply be written /.../ for better readability (IMO).
  • You do not (really) need an intermediate variable such as @archives.
  • ("|>) is a roundabout way of saying [">], and it adds a capture group you do not need.
  • The grep is redundant processing. You can simply do:

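while (<FILE>) {
    # [^"<>]* keeps each match inside one pair of delimiters
    push @files, /[">]([^"<>]*\.zip)["<]/ig;
}

In short:

my @files;
foreach my $file (@fnames) {
    chomp $file;
    open my $fh, '<', $file or die "Cannot open '$file': $!";
    while (<$fh>) {
        # In list context, /g returns every captured name on the line.
        push @files, /[">]([^"<>]*\.zip)["<]/ig;
    }
    close $fh;
}
print "File names found: @files\n";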
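
To try the loop without touching real data, here is a self-contained variant as a minimal sketch. It writes a made-up sample file with File::Temp and runs the same kind of extraction over it. The sample contents and the helper name extract_zip_names are assumptions for illustration only, not something taken from the question.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: collect every .zip name found between double
# quotes or angle brackets in the given files.
sub extract_zip_names {
    my @fnames = @_;
    my @files;
    foreach my $file (@fnames) {
        open my $fh, '<', $file or die "Cannot open '$file': $!";
        while (<$fh>) {
            # /g in list context yields all captures on the line
            push @files, /[">]([^"<>]*\.zip)["<]/ig;
        }
        close $fh;
    }
    return @files;
}

# Made-up sample input, written to a temporary file that is
# removed automatically when the program exits.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} qq{download "alpha.zip" here\n};
print {$tmp_fh} qq{no archive on this line\n};
print {$tmp_fh} qq{<td>Beta.ZIP</td> and "gamma.zip"\n};
close $tmp_fh;

my @found = extract_zip_names($tmp_name);
print "File names found: @found\n";
```

With this sample the script should report alpha.zip, Beta.ZIP and gamma.zip. Passing a path that does not exist makes open die with the file name and the system error instead of failing silently.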
TLP answered Nov 27 '25 05:11