I have lot of xml files which are named like:
First_ExampleXML_Only_This_Should_Be_Name_20211234567+1234565.xml
Second_ExampleXML_OnlyThisShouldBeName_202156789+55684894.xml
Third_ExampleXML_Only_This_Should_Be_Name1_2021445678+6963696.xml
Fourth_ExampleXML_Only_This_Should_Be_Name2_20214567+696656.xml
I have to make a script that will go through all of the files and rename them, so only this is left from the example:
Only_This_Should_Be_Name.xml
OnlyThisShouldBeName.xml
Only_This_Should_Be_Name1xml
Only_This_Should_Be_Name2.xml
At the moment I have something like this but really struggling to get exactly what I need, guess that have to count from second _ up to _202, and take everything in between.
fnames = listdir('.')
for fname in fnames:
# replace .xml with type of file you want this to have impact on
if fname.endswith('.xml):
Anyone has idea what would be the best approach to do it?
You can strip the contents by splitting with underscores for all xml files and rename with the first value in the list as below.
import os
fnames = os.listdir('.')
for fname in fnames:
# replace .xml with type of file you want this to have impact on
if fname.endswith('.xml'):
newName = '_'.join(fname.split("_")[2:-1])
os.rename(fname, newName+".xml")
else:
continue
here you are eliminating the values which are before and after "_".
There are two problems here:
Whilst listdir will work, you might as well glob them:
from pathlib import Path
for fn in Path("/path").glob("*.xml"):
....
In this case your files are named "file_name_NUMBERS.xml" and we want to strip the numbers out, so we'll use a regex: Edit: this is not the best way in this case. Just split and combine as in the other answer
import re
from pathlib import Path
for fn in Path("dir").glob("*.xml"):
new_name = re.search(r"(.*?)_[0-9]+", fn.stem).group(1)
fn.rename(fn.with_name(new_name + ".xml"))
Edit: don't know why I overcomplicted things. I'll leave the re solution there for more difficult cases, but in this case you can just do:
new_name = "_".join(fn.stem.split("_")[:-1])
Which is greately superior as it doesn't depend on the precise naming of the files.
Note that you can do all this without pathlib, but you asked for the best way ;)
Lastly, to answer an implicit question, nothing stops you wrapping all this in a function and passing an argument to glob for different types of files.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With