Some background information: We have an ancient web-based document database system where I work, almost entirely consisting of MS Office documents with the "normal" extensions (.doc, .xls, .ppt). They are all named based on some sort of arbitrary ID number (i.e. 1245.doc). We're switching to SharePoint and I need to rename all of these files and sort them into folders. I have a CSV file with all sorts of information (like which ID number corresponds to which document's title), so I'm using it to rename these files. I've written a short Python script that renames the ID number title.
However, some of the titles of the documents have slashes and other possibly bad characters to have in a title of a file, so I want to replace them with underscores:
bad_characters = ["/", "\\", ":", "(", ")", "<", ">", "|", "?", "*"]
for letter in bad_characters:
filename = line[2].replace(letter, "_")
foldername = line[5].replace(letter, "_")
line[2]
: "Blah blah boring - meeting 2/19/2008.doc"line[5]
: "Business meetings 2/2008"When I add print letter
inside of the for
loop, it will print out the letter it's supposed to be replacing, but won't actually replace that character with an underscore like I want it to.
Is there anything I'm doing wrong here?
You are facing this issue because you are using the replace method incorrectly. When you call the replace method on a string in python you get a new string with the contents replaced as specified in the method call. You are not storing the modified string but are just using the unmodified string.
To replace a string in Python, use the string. replace() method. The replace() method accepts three arguments and replaces the specified phrase with another specified phrase.
To replace a character in a String, without using the replace() method, try the below logic. Let's say the following is our string. int pos = 7; char rep = 'p'; String res = str. substring(0, pos) + rep + str.
That's because filename
and foldername
get thrown away with each iteration of the loop. The .replace()
method returns a string, but you're not saving the result anywhere.
You should use:
filename = line[2]
foldername = line[5]
for letter in bad_characters:
filename = filename.replace(letter, "_")
foldername = foldername.replace(letter, "_")
But I would do it using regex. It's cleaner and (likely) faster:
p = re.compile('[/:()<>|?*]|(\\\)')
filename = p.sub('_', line[2])
folder = p.sub('_', line[5])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With