
Node.js archiver: Need syntax for excluding file types via glob

Using archiver.js (for Node.js), I need to exclude images from a recursive (multi-subdir) archive. Here is my code:

const zip = archiver('zip', { zlib: { level: 9 } });
const output = await fs.createWriteStream(`backup/${fileName}.zip`);
res.setHeader('Content-disposition', `attachment; filename=${fileName}.zip`);
res.setHeader('Content-type', 'application/download');
output.on('close', function () {
  res.download(`backup/${fileName}.zip`, `${fileName}.zip`);
});
output.on('end', function () {
  res.download(`backup/${fileName}.zip`, `${fileName}.zip`);
});
zip.pipe(output);
zip.glob('**/*',
  {
    cwd: 'user_uploads',
    ignore: ['*.jpg', '*.png', '*.webp', '*.bmp'],
  },
  {});
zip.finalize();

The problem is that the files matching the ignore patterns are not excluded. How can I correct the syntax?

Asked by crashwap, Nov 06 '22 02:11



1 Answer

Archiver uses Readdir-Glob for globbing, which in turn uses minimatch for pattern matching.

The matching in Readdir-Glob (node-readdir-glob/index.js#L147) is done against the full filename, including the path, and it does not let us apply the matchBase option, which would match against just the basename of the full path.

To make it work, you have two options:


1. Make your glob exclude the file extensions

You can rewrite the glob expression itself to exclude the file extensions you don't want in your archive, using the glob negation !(...). The pattern then includes everything except what matches the negated expression:

zip.glob(
  '**/!(*.jpg|*.png|*.webp|*.bmp)',
  {
    cwd: 'user_uploads',
  },
  {}
);
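
If the list of extensions changes often, the negated glob can be generated instead of typed by hand. The following is only a sketch: buildNegatedGlob is a hypothetical helper name, not something archiver or Readdir-Glob provides, and it assumes the extensions contain no glob metacharacters.

```javascript
// Hypothetical helper (not part of archiver or Readdir-Glob).
// Builds a glob whose last segment matches any name that does NOT
// end in one of the given extensions.
function buildNegatedGlob(extensions) {
  if (!Array.isArray(extensions) || extensions.length === 0) {
    return '**/*'; // nothing to exclude: match everything
  }
  const alternatives = extensions
    .map((ext) => `*.${ext.replace(/^\./, '')}`) // accept 'jpg' or '.jpg'
    .join('|');
  return `**/!(${alternatives})`;
}

const pattern = buildNegatedGlob(['jpg', 'png', 'webp', 'bmp']);
console.log(pattern); // **/!(*.jpg|*.png|*.webp|*.bmp)

// Intended use with the call above (assumes `zip` from the question):
// zip.glob(pattern, { cwd: 'user_uploads' }, {});
```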

2. Make minimatch work with the full file path

Since we cannot set the matchBase option, each ignore pattern has to include a directory glob so that it matches the full path:

zip.glob(
  '**/*',
  {
    cwd: 'user_uploads',
    ignore: ['**/*.jpg', '**/*.png', '**/*.webp', '**/*.bmp'],
  },
  {}
);
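
An existing list of basename-style patterns, like the one in the question, can be converted mechanically. This is a minimal sketch that approximates what matchBase would do for slash-free patterns; anyDepth is a hypothetical helper name, not an archiver or Readdir-Glob function.

```javascript
// Hypothetical helper (not part of archiver or Readdir-Glob).
// Prefixes slash-free patterns with '**/' so they match at any depth;
// patterns that already contain a '/' are left untouched.
function anyDepth(patterns) {
  return patterns.map((p) => (p.includes('/') ? p : `**/${p}`));
}

// The ignore list from the question, rewritten to match full paths.
const ignore = anyDepth(['*.jpg', '*.png', '*.webp', '*.bmp']);
console.log(ignore);

// Intended use (assumes `zip` from the question):
// zip.glob('**/*', { cwd: 'user_uploads', ignore }, {});
```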

Behaviour

The behaviour of Readdir-Glob regarding the ignore option is a bit confusing. Its documentation says:

Options

ignore: Glob pattern or Array of Glob patterns to exclude matches. If a file or a folder matches at least one of the provided patterns, it's not returned. It doesn't prevent files from folder content to be returned.

This means that ignore items have to be actual glob expressions that cover the whole path, not just the file name. When we specify *.jpg, it matches files only in the root directory and not in subdirectories. To exclude JPG files deep in the directory tree, we have to combine the match-all-directories pattern with the file extension pattern, which gives **/*.jpg.
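
The difference between *.jpg and **/*.jpg can be seen without any library. The sketch below uses a deliberately simplified stand-in for minimatch (toRegExp is a hypothetical function that understands only '*' and '**/'), so it illustrates the rule rather than reproducing minimatch's actual implementation.

```javascript
// Simplified stand-in for minimatch, for illustration only.
// Supports just two constructs: '**/' (zero or more directories)
// and '*' (any run of characters except '/').
function toRegExp(glob) {
  const source = glob
    .split('**/')
    .map((part) =>
      part
        .replace(/[.+^${}()|[\]\\?]/g, '\\$&') // escape regex metacharacters
        .replace(/\*/g, '[^/]*') // '*' never crosses a '/'
    )
    .join('(?:[^/]+/)*'); // '**/' spans any number of directory levels
  return new RegExp(`^${source}$`);
}

// Paths relative to cwd, as the matcher sees them.
const paths = ['photo.jpg', 'albums/2022/photo.jpg', 'notes.txt'];

const rootOnly = paths.filter((p) => toRegExp('*.jpg').test(p));
const everyLevel = paths.filter((p) => toRegExp('**/*.jpg').test(p));

console.log(rootOnly); // only 'photo.jpg'
console.log(everyLevel); // 'photo.jpg' and 'albums/2022/photo.jpg'
```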

Exclude only in subdirectories

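If you want to exclude some file extensions only inside specific subdirectories, add the subdirectory name to the ignore pattern:

// The ignore pattern '**/Subdir/**/*.jpg' excludes all JPG files
// that are inside any 'Subdir/' directory, at any depth below it.

zip.glob(
  '**/*',
  {
    cwd: 'user_uploads',
    ignore: ['**/Subdir/**/*.jpg'],
  },
  {}
);

Note that a negated segment such as '**/!(Subdir)/*.jpg' does the opposite: !(Subdir) matches any directory name other than Subdir, so that pattern excludes JPG files whose immediate parent directory is named anything other than Subdir, and keeps the ones sitting directly in a 'Subdir/' directory.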
Answered by Christos Lytras, Nov 11 '22 04:11
