
Regex to capture just the filename (no URL path, no extension)

In JavaScript, I can use the regex `([^\/]+)(\.[^\.\/]+)$` to capture just the filename in a URL. It works well in the following cases:

http://a.com/b/file.name.ext
http://a.com/b/file.name.ext#hash
http://a.com/b/file.name.ext?query

However it fails to match if there is no extension:

http://a.com/b/filename
http://a.com/b/filename#hash
http://a.com/b/filename?query

This is expected: the second capturing group requires a .ext chunk at the end.

If I make the second capturing group optional...

`([^\/]+)(\.[^\.\/]+)?$`

... then the first capturing group becomes greedy, and includes the .ext ending, which I don't want. How is the regex engine thinking about the optional second group? How can I make the existence of an extension optional?
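The behaviour described above can be reproduced with a short sketch. The two patterns are the ones quoted in the question; the helper name `firstGroup` is only an illustrative assumption.

```javascript
// Patterns quoted in the question.
const required = /([^\/]+)(\.[^\.\/]+)$/;   // extension group is mandatory
const optional = /([^\/]+)(\.[^\.\/]+)?$/;  // extension group made optional

// Hypothetical helper: returns capture group 1, or null when nothing matches.
function firstGroup(regex, url) {
  const match = regex.exec(url);
  return match ? match[1] : null;
}

console.log(firstGroup(required, "http://a.com/b/file.name.ext")); // "file.name"
console.log(firstGroup(required, "http://a.com/b/filename"));      // null
console.log(firstGroup(optional, "http://a.com/b/file.name.ext")); // "file.name.ext"
console.log(firstGroup(optional, "http://a.com/b/filename"));      // "filename"
```

The third result shows how the extension ends up in group 1: `[^\/]+` also matches dots, so it consumes the whole last path segment, and because group 2 may now match nothing, the engine has no reason to backtrack.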


NOTE: This regex is not intended for use with URLs with the following structure:

http://a.com/b/filename?query=a.b
http://a.com/b/filename.ext?query=a.b

In my case, dots will never appear later in the URL.
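For illustration only, here is a minimal sketch of what happens with such a URL, using the pattern quoted at the top of the question. The variable names are assumptions.

```javascript
// Pattern quoted in the question (extension required).
const pattern = /([^\/]+)(\.[^\.\/]+)$/;

// A dot inside the query string becomes the last dot in the URL,
// so the split between the two groups lands in the wrong place.
const match = pattern.exec("http://a.com/b/filename.ext?query=a.b");

console.log(match[1]); // "filename.ext?query=a"
console.log(match[2]); // ".b"
```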

James Newton asked May 03 '15 13:05


2 Answers

If you want a pure regex (that is, a regular expression in the strict sense of theoretical computer science, plus capturing groups), you can do it with alternation:

`([^\/.]+)$|([^\/]+)(\.[^\/.]+)$`

and treat groups 1 and 2 as the same thing: whichever of them matched holds the filename. Group 3 is the optional extension.
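A minimal sketch of how the two groups might be merged in JavaScript. The helper name `splitName` and the shape of its return value are assumptions made for illustration.

```javascript
// Alternation pattern from the answer above.
const alt = /([^\/.]+)$|([^\/]+)(\.[^\/.]+)$/;

// Hypothetical helper: merges groups 1 and 2 into a single filename.
function splitName(url) {
  const m = alt.exec(url);
  if (!m) return null;
  return {
    // Only one of groups 1 and 2 takes part in any given match.
    filename: m[1] !== undefined ? m[1] : m[2],
    // Group 3 is undefined when the first alternative matched.
    extension: m[3] || ""
  };
}

console.log(splitName("http://a.com/b/file.name.ext")); // filename "file.name", extension ".ext"
console.log(splitName("http://a.com/b/filename"));      // filename "filename", extension ""
```

As with the pattern in the question, a trailing #hash or ?query stays attached to the last group that matched, so it would need to be stripped separately.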

Another possibility:

`([^\/.]+)(([^\/]*)(\.[^\/.]+))?$`

Here you'd use group 4 as the extension, and the concatenation of groups 1 and 3 as the filename. Group 2 is only used to make the compound of 3 and 4 optional.
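The concatenation could look roughly like this. It is a sketch, and the helper name `splitName` is an assumption rather than part of the answer.

```javascript
// Single-pattern variant from the answer above.
const nested = /([^\/.]+)(([^\/]*)(\.[^\/.]+))?$/;

// Hypothetical helper: rebuilds the filename from groups 1 and 3.
function splitName(url) {
  const m = nested.exec(url);
  if (!m) return null;
  // Group 3 holds any middle part such as ".name". Groups 3 and 4 are
  // undefined when the optional group 2 did not take part in the match.
  return {
    filename: m[1] + (m[3] || ""),
    extension: m[4] || ""
  };
}

console.log(splitName("http://a.com/b/file.name.ext")); // filename "file.name", extension ".ext"
console.log(splitName("http://a.com/b/filename"));      // filename "filename", extension ""
```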

Jo So answered Nov 16 '22 11:11


Tested with:

http://a.com/b/file.name.ext
http://a.com/b/filename
http://a.com/b/filename#hash
http://a.com/b/filename?query

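var file = "http://a.com/b/filename#hash";
function getFileName(url) {
    // Take everything after the last slash.
    var index = url.lastIndexOf("/") + 1;
    var filenameWithExtension = url.slice(index);
    // Drop any trailing #hash or ?query.
    filenameWithExtension = filenameWithExtension.replace(/[#?].*$/, "");
    // Remove only the final extension, so "file.name.ext" gives "file.name".
    var dot = filenameWithExtension.lastIndexOf(".");
    return dot > 0 ? filenameWithExtension.slice(0, dot) : filenameWithExtension;
}

alert(getFileName(file));
//filename

References:

lastIndexOf

slice

replace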

Pedro Lobito answered Nov 16 '22 12:11