 

Bash script to remove 'x' number of characters from the end of multiple filenames in a directory?

Tags: linux, bash, unix, sed

I have a list of file names in a directory (/path/to/local). I would like to remove a certain number of characters from all of those filenames.

Example filenames:

iso1111_plane001_00321.moc1
iso1111_plane002_00321.moc1
iso2222_plane001_00123.moc1

In every filename I wish to remove the last 5 characters before the file extension.

For example:

iso1111_plane001_.moc1
iso1111_plane002_.moc1
iso2222_plane001_.moc1

I believe this can be done using sed, but I cannot work out the exact command. Something like...

for filename in /path/to/local/*.moc1; do
    mv $filname $(echo $filename | sed -e 's/.....^//');
done

...but that does not work. Sorry if I butchered the sed options, I do not have much experience with it.

user2600230 asked Jul 19 '13 16:07



2 Answers

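 mv "$filename" "$(echo "$filename" | sed -e 's/.....\.moc1$/.moc1/')"

or

 echo "${filename%%?????.moc1}.moc1"

%% is a shell parameter expansion operator that removes the longest suffix matching the pattern. Each ? matches exactly one character, so the pattern has a fixed length and a single % would give the same result here.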
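As a rough sketch of how the parameter expansion form could be dropped into the loop from the question: the helper name strip_five, the dir variable, and the dry-run echo are illustrative additions, not part of the original answer. The helper works on the bare file name so that characters from the directory path are never counted among the five.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print NAME without the five characters before ".moc1".
# Names that are too short or have another extension are printed unchanged.
strip_five() {
    case $1 in
        ?????*.moc1) printf '%s\n' "${1%?????.moc1}.moc1" ;;
        *)           printf '%s\n' "$1" ;;
    esac
}

dir=/path/to/local
for path in "$dir"/*.moc1; do
    [ -e "$path" ] || continue            # the glob matched nothing
    old=${path##*/}                       # file name without the directory
    new=$(strip_five "$old")
    [ "$old" = "$new" ] && continue       # nothing to strip
    echo mv -- "$path" "$dir/$new"        # dry run; remove "echo" to rename
done
```

Running it as written only prints the mv commands. Note that two files differing only in the stripped characters would end up with the same new name, so something like mv -n (no-clobber, available in GNU and BSD mv) may be worth considering before removing the echo.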

Zoltán Haindrich answered Oct 23 '22 05:10



This sed command will work for all the examples you gave.

sed -e 's/\(.*\)_.*\.moc1/\1_.moc1/'

However, if you specifically want to "remove 5 characters before the last extension in a filename", whatever the extension, use this command instead:

sed -e 's/\(.*\)[0-9a-zA-Z]\{5\}\.\([^.]*\)/\1.\2/'

You can implement this in your script like so:

for filename in /path/to/local/*.moc1; do
    mv "$filename" "$(echo "$filename" | sed -e 's/\(.*\)[0-9a-zA-Z]\{5\}\.\([^.]*\)/\1.\2/')"
done

First Command Explanation

The first sed command works by grabbing all characters up to the last underscore (.* is greedy, so it takes as much as it can): \(.*\)_

Then it discards all characters until it finds .moc1: .*\.moc1

Then it replaces the text that it found with everything it grabbed at first inside the parentheses: /\1

And finally it adds the underscore and the .moc1 extension back on the end and ends the regex: _.moc1/
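To see the steps above in action, here is a small sketch that feeds the example names from the question through the first expression. The wrapper name first_form and the extra test name iso_plane_1234567.moc1 are made up for illustration.

```shell
#!/usr/bin/env bash
# first_form is an illustrative wrapper around the first sed expression.
first_form() {
    sed -e 's/\(.*\)_.*\.moc1/\1_.moc1/'
}

# The three example names from the question.
printf '%s\n' \
    iso1111_plane001_00321.moc1 \
    iso1111_plane002_00321.moc1 \
    iso2222_plane001_00123.moc1 | first_form
# iso1111_plane001_.moc1
# iso1111_plane002_.moc1
# iso2222_plane001_.moc1

# Everything after the last underscore goes, however long it is.
echo 'iso_plane_1234567.moc1' | first_form
# iso_plane_.moc1
```

The last call shows why this form only suits names shaped like the examples: it does not count characters, it simply drops whatever sits between the last underscore and .moc1.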

Second Command Explanation

The second sed command works by grabbing all characters at first: \(.*\)

And then it is forced to stop grabbing characters so it can discard five characters, or more specifically, five characters that lie in the ranges 0-9, a-z, and A-Z: [0-9a-zA-Z]\{5\}

Then comes the dot '.' character to mark the last extension: \.

And then it looks for all non-dot characters. This ensures that we are grabbing the last extension: \([^.]*\)

Finally, it replaces all that text with the first and second capture groups, separated by the . character, and ends the regex: /\1.\2/
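The same kind of check can be run for the second expression. This is only a sketch: second_form is an illustrative wrapper name, and the names iso_12.moc1 and notes_12345.txt are invented to show a non-match and a different extension.

```shell
#!/usr/bin/env bash
# second_form is an illustrative wrapper around the second sed expression.
second_form() {
    sed -e 's/\(.*\)[0-9a-zA-Z]\{5\}\.\([^.]*\)/\1.\2/'
}

# The three example names from the question.
printf '%s\n' \
    iso1111_plane001_00321.moc1 \
    iso1111_plane002_00321.moc1 \
    iso2222_plane001_00123.moc1 | second_form
# iso1111_plane001_.moc1
# iso1111_plane002_.moc1
# iso2222_plane001_.moc1

# Works for any extension, as long as five alphanumerics precede the dot.
echo 'notes_12345.txt' | second_form
# notes_.txt

# Fewer than five alphanumerics before the dot: no match, name unchanged.
echo 'iso_12.moc1' | second_form
# iso_12.moc1
```

Because the character class excludes the underscore, a name with too few letters or digits before the extension passes through untouched rather than losing part of its prefix.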

Cory Klein answered Oct 23 '22 06:10
