
List all files in a directory given a regular expression / a set of extensions (Matlab)

Tags: file, matlab

I have a regular expression defining the filenames of interest. What is the best way to list all files in a directory that match this condition?

My attempt at this is:

f = dir(DIR);
f = {f([f.isdir] == 0).name};
result = f(~cellfun(@isempty, regexpi(f, '.*(avi|mp4)')));

However, I wonder if there is a faster and/or cleaner solution to this?

Is it possible to simplify it if, instead of a regular expression, I only have a list of possible file extensions?

John Manak asked May 25 '13 16:05



1 Answer

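Fundamentally, your approach is the one I would use. However, the code can be shortened: directory names drop out as long as they do not match the pattern, and the empty cells left by non-matching names disappear in the final concatenation:

f = dir('C:\directory');
% 'match' returns one cell per name; names that do not match yield empty cells.
% The pattern is anchored so that only names ending in .txt or .pdf are kept.
f = regexpi({f.name},'^.*\.(txt|pdf)$','match');
% Concatenating the nested cells discards the empty ones.
f = [f{:}];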
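
If you only have a list of extensions, one option is to build the pattern from that list. The following is a minimal sketch, not a definitive implementation: the variable names and sample file names are made up for illustration, and it assumes strjoin() is available (R2013a or later).

```matlab
% Hypothetical sample data standing in for {f.name}
names = {'clip.avi','movie.MP4','notes.txt','avi_readme.doc'};
exts  = {'avi','mp4'};                       % extensions without the dot

% Build an anchored pattern such as '^.*\.(avi|mp4)$'
pattern = ['^.*\.(' strjoin(exts,'|') ')$'];

% regexpi is case-insensitive; 'once' yields an empty cell for non-matches
hits   = regexpi(names, pattern, 'once');
result = names(~cellfun(@isempty, hits));
```

Anchoring the pattern with ^ and $ keeps a name like avi_readme.doc from matching just because it contains "avi".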

Also, note that the function dir() accepts wildcards (*) but not multiple extensions:

dir('C:\directory\*.avi')

This lets you retrieve only the files with a given extension directly, but you need one call to dir() per extension, so you have to loop over the list:

d   = 'C:\users\oleg\desktop';
ext = {'*.txt','*.pdf'};
f   = [];
for e = 1:numel(ext)
    f = [f; dir(fullfile(d,ext{e}))];
end
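
Another way to filter by a list of extensions without a regular expression is to compare each file's extension against the list. This is only a sketch under stated assumptions: the sample names are hypothetical, and in practice names would come from {f.name} after removing directories.

```matlab
% Hypothetical sample data standing in for {f.name}
names = {'clip.avi','movie.MP4','notes.txt','archive.mp4.bak'};
exts  = {'.avi','.mp4'};                     % extensions including the dot

% fileparts returns the extension as its third output
[~, ~, e] = cellfun(@fileparts, names, 'UniformOutput', false);

% Compare case-insensitively against the allowed list
keep   = ismember(lower(e), exts);
result = names(keep);
```

Because fileparts() only returns the last extension, archive.mp4.bak is treated as a .bak file and is excluded.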

Alternative (not recommended)

ext = {'*.txt','*.pdf'};
str = ['!dir ' sprintf('%s ',ext{:}) '/B'];
textscan(evalc(str),'%s','Delimiter','')

Here str is !dir *.txt *.pdf /B; evalc() captures the output of that command and textscan() parses it. Note that this relies on the Windows dir shell command, so it is not portable.

Oleg answered Oct 19 '22 23:10