I'm writing an application that uses renaming rules to rename a list of files based on information given by the user. The files may be inconsistently named to begin with, or the filenames may be consistent. The user selects a list of files, and inputs information about the files (for MP3s, they would be Artist, Title, Album, etc). Using a rename rule (example below), the program uses the user-inputted information to rename the files accordingly.
However, if all or some of the files are already named consistently, I would like the program to 'guess' the file information from the filenames. That is the problem I'm having. What is the best way to do this?
Sample filenames:
Kraftwerk-Kraftwerk-01-RuckZuck.mp3
Kraftwerk-Autobahn-01-Autobahn.mp3
Kraftwerk-Computer World-03-Numbers.mp3
Rename Rule:
%Artist%-%Album%-%Track%-%Title%.mp3
The program should properly deduce the Artist, Track number, Title, and Album name.
Again, what's the best way to do this? I was thinking regular expressions, but I'm a bit confused.
The easiest approach would be to replace each %Label% with (?<Label>.*?) and escape any other characters.
%Artist%-%Album%-%Track%-%Title%.mp3
becomes
(?<Artist>.*?)-(?<Album>.*?)-(?<Track>.*?)-(?<Title>.*?)\.mp3
You would then get each component into named capture groups.
using System.Collections.Generic;
using System.Text.RegularExpressions;

Dictionary<string,string> match_filename(string rule, string filename) {
    // Escape the rule, then turn each %Label% into a lazy named capture group.
    Regex tag_re = new Regex(@"%(\w+)%");
    string pattern = tag_re.Replace(Regex.Escape(rule), @"(?<$1>.*?)");
    Regex filename_re = new Regex(pattern);
    Match match = filename_re.Match(filename);
    Dictionary<string,string> tokens =
        new Dictionary<string,string>();
    // Group 0 is the whole match, so start at 1 to collect only the captured fields.
    for (int counter = 1; counter < match.Groups.Count; counter++)
    {
        string group_name = filename_re.GroupNameFromNumber(counter);
        tokens.Add(group_name, match.Groups[counter].Value);
    }
    return tokens;
}
But if the user leaves out the delimiters, or if a delimiter can also appear inside a field, you could get some strange results. The pattern for %Artist%%Album% would become (?<Artist>.*?)(?<Album>.*?), which is equivalent to .*?.*? and gives the pattern no way to know where to split.
This could be solved if you know the format of certain fields, such as the track number. If you translate %Track% to (?<Track>\d+) instead, the Track group can only match a run of digits, which gives the pattern a fixed point for separating the fields on either side of it.
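As a rough sketch of that idea, the version below looks up a specific pattern for each field and falls back to .*? when none is known. The RuleMatcher class, the FieldPatterns table, and the BuildPattern and MatchFilename names are all made up for illustration, and the ^ and $ anchors are an extra assumption not present in the pattern above. It will still guess wrong if another field contains digits of its own.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RuleMatcher
{
    // Hypothetical table of fields with a known format.
    // Any field not listed here falls back to the lazy ".*?".
    private static readonly Dictionary<string, string> FieldPatterns =
        new Dictionary<string, string>
        {
            { "Track", @"\d+" },
        };

    // Converts a rule such as "%Artist%-%Track%.mp3" into a regex pattern.
    public static string BuildPattern(string rule)
    {
        string escaped = Regex.Escape(rule);
        string body = Regex.Replace(escaped, @"%(\w+)%", m =>
        {
            string name = m.Groups[1].Value;
            string fieldPattern;
            if (!FieldPatterns.TryGetValue(name, out fieldPattern))
            {
                fieldPattern = ".*?";
            }
            return "(?<" + name + ">" + fieldPattern + ")";
        });
        // Assumption: the rule describes the whole filename, so anchor both ends.
        return "^" + body + "$";
    }

    // Returns the captured fields, or an empty dictionary if the filename does not fit the rule.
    public static Dictionary<string, string> MatchFilename(string rule, string filename)
    {
        var tokens = new Dictionary<string, string>();
        var regex = new Regex(BuildPattern(rule));
        Match match = regex.Match(filename);
        if (!match.Success)
        {
            return tokens;
        }
        foreach (string groupName in regex.GetGroupNames())
        {
            if (groupName == "0") continue; // skip the whole-match group
            tokens[groupName] = match.Groups[groupName].Value;
        }
        return tokens;
    }
}
```

With this, a rule without delimiters such as %Artist%%Track%%Title%.mp3 can still split a name like Kraftwerk01RuckZuck.mp3, because the digits can only belong to Track. More entries (a four-digit year, for example) could be added to the table in the same way.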