rename multiple files splitting filenames by '_' and retaining first and last fields

Say I have the following files:

a_b.txt               a_b_c.txt             a_b_c_d_e.txt         a_b_c_d_e_f_g_h_i.txt

I want to rename them in such a way that I split their filenames by _ and I retain the first and last field, so I end up with:

a_b.txt               a_c.txt             a_e.txt         a_i.txt

Thought it would be easy, but I'm a bit stuck...

I tried rename with the following regexp:

rename 's/^([^_]*).*([^_]*[.]txt)/$1_$2/' *.txt

But what I really need is to actually split the filename, so I thought of awk, but I'm not so proficient with it. This is what I have so far (I know at some point I should specify FS="_" and grab the first and last field somehow):

find . -name "*.txt" | awk -v mvcmd='mv "%s" "%s"\n' '{old=$0; <<split by _ here somehow and retain first and last fields>>; printf mvcmd,old,$0}'

Any help? I don't have a preferred method, but it would be nice to use this to learn awk. Thanks!

DaniCee asked Aug 30 '21 10:08


2 Answers

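Your rename attempt was close. The problem is that the greedy .* consumes everything up to .txt, so the final group captures only the extension; you need to anchor that group to the last underscore.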

rename 's/^([^_]*).*_([^_]*[.]txt)$/$1_$2/' *_*_*.txt

I added a _ before the last opening parenthesis (this is the crucial fix), and a $ anchor at the end, and also extended the wildcard so that you don't process any files which don't contain at least two underscores.
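
To see what the extra _ changes, both substitutions can be previewed on the sample names without renaming anything. This is only a sketch: it uses sed -E as a stand-in for the Perl-style rename (an assumption that the two regex dialects treat this particular pattern the same way), and the function names old_pattern and new_pattern are made up for illustration.

```shell
#!/bin/sh
# Preview only: nothing is renamed. sed -E stands in for Perl rename here.

# The pattern from the question: nothing forces group 2 to start
# right after an underscore.
old_pattern() {
    printf '%s\n' "$1" | sed -E 's/^([^_]*).*([^_]*[.]txt)/\1_\2/'
}

# The fixed pattern: the literal _ makes group 2 begin after the
# last underscore, and $ anchors it to the end of the name.
new_pattern() {
    printf '%s\n' "$1" | sed -E 's/^([^_]*).*_([^_]*[.]txt)$/\1_\2/'
}

for name in a_b_c.txt a_b_c_d_e.txt a_b_c_d_e_f_g_h_i.txt; do
    printf '%s -> old: %s  new: %s\n' \
        "$name" "$(old_pattern "$name")" "$(new_pattern "$name")"
done
```

If your rename is the Perl one, it may also offer a dry-run option; check its manual page before relying on that.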

The equivalent in Awk might look something like

find . -name "*_*_*.txt" |
awk -F _ '{ system("mv " $0 " " $1 "_" $(NF)) }'

This is somewhat brittle because of the system call; you might need to rethink your approach if your file names could contain whitespace or other shell metacharacters. You could add quoting to partially fix that, but then the command will fail if the file name contains literal quotes. You could fix that, too, but then this will be a little too complex for my taste.
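
Since the question asks about learning awk, it may help to look at the field splitting on its own before letting awk run mv through system. The following is a minimal sketch that only prints the old and new names; the function name new_names is hypothetical, and the input is assumed to be one plain file name per line with no directory part.

```shell
#!/bin/sh
# Print "old -> new" for each input line; nothing is renamed.
# -F _ makes awk split each line on underscores, so $1 is the first
# field and $NF is the last one. NF > 2 skips names that contain
# fewer than two underscores.
new_names() {
    awk -F _ 'NF > 2 { print $0 " -> " $1 "_" $NF }'
}

printf '%s\n' a_b.txt a_b_c.txt a_b_c_d_e.txt | new_names
```

Once the printed names look right, the print can be swapped for the system call, keeping the quoting caveats above in mind.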

Here's a less brittle approach which should cope with completely arbitrary file names, even ones with newlines in them:

find . -name "*_*_*.txt" -exec sh -c 'for f; do
    mv "$f" "${f%%_*}_${f##*_}"
  done' _ {} +

find will supply a leading path before each file name, so we don't need mv -- here (there will never be a file name which starts with a dash).

The parameter expansion ${f##pattern} produces the value of the variable f with the longest available match on pattern trimmed off from the beginning; ${f%%pattern} does the same, but trims from the end of the string.
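
A small sketch of these expansions on one sample path, with the single-character forms % and # shown for contrast (they trim the shortest match instead of the longest). The variable names other than f are only for illustration.

```shell
#!/bin/sh
f=./a_b_c_d_e.txt

first=${f%%_*}        # longest match of _* trimmed from the end
last=${f##*_}         # longest match of *_ trimmed from the start

# For contrast: a single % or # trims the shortest match.
short_end=${f%_*}
short_start=${f#*_}

printf '%s\n' "$first" "$last" "$short_end" "$short_start"
```

Note that the expansions operate on the whole path that find supplies, so a directory name containing an underscore would be cut as well; this sketch assumes the files sit directly in the current directory.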

tripleee answered Dec 08 '22 10:12


With your shown samples, please try the following pure Bash code, which relies on Bash's parameter expansion. The glob *_*_*.txt only matches .txt files with at least two underscores in their name, so files like a_b.txt are not picked up, as per the requirement.

for file in *_*_*.txt
do
   firstPart="${file%%_*}"
   secondPart="${file##*_}"
   newName="${firstPart}_${secondPart}"
   mv -- "$file"  "$newName"
done
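
Two things the loop above does not guard against are a glob that matches nothing and two files that map to the same new name (for example a_b_c.txt and a_x_c.txt both become a_c.txt). The sketch below is one possible way to handle both; the function name rename_first_last and the scratch directory are assumptions made for the demo, not part of the answer.

```shell
#!/bin/sh
# Hypothetical helper: rename the files in directory $1, keeping only
# the first and last underscore-separated fields of each name.
rename_first_last() {
    (
        cd "$1" || exit 1
        for file in *_*_*.txt; do
            # If nothing matched, the pattern itself is left in $file.
            [ -e "$file" ] || continue
            newName="${file%%_*}_${file##*_}"
            # Do not overwrite a file that already has the target name.
            if [ -e "$newName" ]; then
                printf 'skipping %s: %s already exists\n' "$file" "$newName" >&2
                continue
            fi
            mv -- "$file" "$newName"
        done
    )
}

# Demo in a scratch directory (delete it afterwards with rm -rf).
demo=$(mktemp -d)
touch "$demo/a_b.txt" "$demo/a_b_c.txt" "$demo/a_x_c.txt"
rename_first_last "$demo"
ls "$demo"
```

Skipping on collision is just one design choice; depending on the data, stopping with an error might be preferable.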
RavinderSingh13 answered Dec 08 '22 12:12