
Most efficient way to strip forbidden characters in file name from Unicode string [duplicate]

I have a string containing data that I parse from the web, and I create a file named after that data.

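import urllib

# Python 2: fetch the page body and use it as the file name
string = urllib.urlopen("http://example.com").read()
f = open(path + "/" + string + ".txt", "w")  # "w" is needed in order to write
f.write("abcdefg")
f.close()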

The problem is that the string may include one of these characters: \ / * ? : " < > |. I'm using Windows, which forbids those characters in a file name. Also, the string is in Unicode format, which makes most of the solutions useless.

So, my question is: what is the most efficient / pythonic way to strip those characters? Thanks in advance!

Edit: the filename is in Unicode format, not str!

ohad987 asked Dec 25 '14 12:12



1 Answer

We don't know what your data looks like, but you can use re.sub:

import re
# Pass the variable itself, not the literal "your_string"
your_string = re.sub(r'[\\/*?:"<>|]', "", your_string)
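
The same idea can be wrapped in a small reusable helper. What follows is only a sketch under some assumptions: the names strip_forbidden, strip_forbidden_translate, FORBIDDEN and DELETE_TABLE are made up for illustration, the character set is just the one listed in the question, and the code targets Python 3, where every str is Unicode. A translate-based variant is included as a regex-free alternative; which of the two is faster likely depends on the input, so measure before choosing.

```python
import re

# Characters forbidden in Windows file names, as listed in the question.
FORBIDDEN = re.compile(r'[\\/*?:"<>|]')

def strip_forbidden(name, replacement=""):
    """Replace every forbidden character in name (hypothetical helper)."""
    return FORBIDDEN.sub(replacement, name)

# Regex-free alternative: map each forbidden code point to None,
# which tells translate() to delete that character.
DELETE_TABLE = {ord(ch): None for ch in '\\/*?:"<>|'}

def strip_forbidden_translate(name):
    """Delete every forbidden character in name using a translation table."""
    return name.translate(DELETE_TABLE)

if __name__ == "__main__":
    print(strip_forbidden('a/b:c*d?.txt'))           # abcd.txt
    print(strip_forbidden('a<b>', '_'))              # a_b_
    print(strip_forbidden_translate('naïve|file'))   # naïvefile
```

Non-ASCII characters such as ï pass through untouched, because only the listed characters are matched. Passing a replacement such as "_" instead of deleting keeps two different inputs from collapsing into the same file name quite as easily.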
Hackaholic answered Oct 29 '22 14:10