
Match filename and file extension from single Regex

I'm sure this must be easy enough, but I'm struggling...

var regexFileName = /[^\\]*$/; // match filename
var regexFileExtension = /(\w+)$/; // match file extension

function displayUpload() {
    var path = $el.val(); //This is a file input
    var filename = path.match(regexFileName); // returns  file name
    var extension = filename[0].match(regexFileExtension); // returns extension

    console.log("The filename is " + filename[0]);
    console.log("The extension is " + extension[0]);
}

The function above works fine, but I'm sure it must be possible to achieve with a single regex, by referencing different parts of the array returned with the .match() method. I've tried combining these regex but without success.

Also, I'm not using a string to test it on in the example, as console.log() escapes the backslashes in a filepath and it was starting to confuse me :)
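For testing against a plain string, one option is `String.raw`, which lets a Windows-style path be written without doubling the backslashes. This is only a sketch: the path below is a made-up example, and the regex is the file name pattern from the question.

```javascript
// Two ways to write the same (hypothetical) path as a string.
const escaped = 'C:\\fakepath\\photo.jpg';     // backslashes doubled in the literal
const raw = String.raw`C:\fakepath\photo.jpg`; // written as-is

console.log(escaped === raw);          // true: both hold single backslashes
console.log(raw.match(/[^\\]*$/)[0]);  // photo.jpg
```

The doubling is a property of string literals in source code; the string itself contains one backslash per separator.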

Tom Bates asked Jan 25 '12 10:01



2 Answers

I know this is an old question, but here's another solution that can handle multiple dots in the name and also when there's no extension at all (or an extension of just '.'):
/^(.*?)(\.[^.]*)?$/

Taking it a piece at a time:
^
Anchor to the start of the string (to avoid partial matches)

(.*?)
Match any character ., 0 or more times *, lazily ? (don't just grab them all if the later optional extension can match), and put them in the first capture group ( ).

(\.
Start a 2nd capture group for the extension using (. This group starts with the literal . character (which we escape with \ so that . isn't interpreted as "match any character").

[^.]*
Define a character set []. Match characters not in the set by specifying this is an inverted character set ^. Match 0 or more non-. chars to get the rest of the file extension *. We specify it this way so that it doesn't match early on filenames like foo.bar.baz, incorrectly giving an extension with more than one dot in it of .bar.baz instead of just .baz. The . doesn't need to be escaped inside [], since it is a literal there (only a few characters, such as ^, ], \ and -, keep a special meaning inside a character set).

)?
End the 2nd capture group ) and indicate that the whole group is optional ?, since it may not have an extension.

$
Anchor to the end of the string (again, to avoid partial matches)

If you're using ES6 you can even use destructuring to grab the results in one line:
const [, filename, extension] = /^(.*?)(\.[^.]*)?$/.exec('foo.bar.baz');
which gives the filename as 'foo.bar' and the extension as '.baz'.
'foo' gives 'foo' and undefined (the optional group did not take part in the match; write extension = '' in the destructuring pattern to get '' instead)
'foo.' gives 'foo' and '.'
'.js' gives '' and '.js'
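The cases above can be checked with a small wrapper. This is a minimal sketch: the function name `splitName` and the returned object shape are illustrative assumptions, not part of the answer. It defaults the extension to '' because an optional group that does not take part in the match comes back as undefined.

```javascript
// Hypothetical helper; splitName is an illustrative name.
function splitName(name) {
  // Group 1: lazily matched base name. Group 2: optional last ".ext".
  const match = /^(.*?)(\.[^.]*)?$/.exec(name);
  // "." does not match line breaks, so such a name would not match at all.
  if (match === null) return null;
  // Default the extension to '' when group 2 is undefined.
  const [, base, extension = ''] = match;
  return { base, extension };
}

console.log(splitName('foo.bar.baz')); // { base: 'foo.bar', extension: '.baz' }
console.log(splitName('foo'));         // { base: 'foo', extension: '' }
console.log(splitName('foo.'));        // { base: 'foo', extension: '.' }
console.log(splitName('.js'));         // { base: '', extension: '.js' }
```

Note that this pattern treats its input as a bare file name; it does not strip a leading directory path.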

Mark Smith answered Oct 06 '22 16:10


Assuming that all files do have an extension, you could use

var regexAll = /[^\\]*\.(\w+)$/;

Then you can do

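var total = path.match(regexAll); // null if the path does not end in ".ext"
var filename = total[0];  // whole match: file name including the extension
var extension = total[1]; // capture group 1: extension without the dot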
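Since `match()` returns null when the path has no extension, a guarded version may be safer. The following is a sketch only: `parsePath` and the sample paths are hypothetical, and the regex is the one given above.

```javascript
// Hypothetical wrapper around the regex above; parsePath is an illustrative name.
function parsePath(path) {
  const total = path.match(/[^\\]*\.(\w+)$/);
  if (total === null) return null; // last path segment has no ".ext"
  return { filename: total[0], extension: total[1] };
}

// String.raw keeps the backslashes literal, so they need no doubling.
console.log(parsePath(String.raw`C:\fakepath\report.final.pdf`));
// { filename: 'report.final.pdf', extension: 'pdf' }
console.log(parsePath(String.raw`C:\fakepath\README`)); // null
```

Here `filename` includes the extension, matching what the question's first regex returns.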

Tim Pietzcker answered Oct 06 '22 16:10