I want to create a sane/safe filename (i.e. somewhat readable, no "strange" characters, etc.) from some random Unicode string (which might contain just anything).
(It doesn't matter to me whether the function is Cocoa, ObjC, Python, etc.)
Of course, there might be infinitely many characters that count as strange. Thus, a blacklist is not really a solution, since I would have to keep adding to it over time.
I could have a whitelist. However, I don't really know how to define it. [a-zA-Z0-9 .] is a start, but I also want to accept Unicode characters that can be displayed in a normal way.
Python:
"".join([c for c in filename if c.isalpha() or c.isdigit() or c==' ']).rstrip()
This accepts Unicode letters and digits but removes line breaks, punctuation, etc.
Example:
filename = u"ad\nbla'{-+\\)(ç?"
gives: adblaç
Edit: str.isalnum() does the alphanumeric check in one step (from a comment by queueoverflow below). danodonovan suggested also keeping the dot.
keepcharacters = (' ', '.', '_')
"".join(c for c in filename if c.isalnum() or c in keepcharacters).rstrip()
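For convenience, the expression above can be wrapped in a small function. This is only a minimal sketch: the name `safe_filename`, the `keep` parameter and the `fallback` value are my own assumptions for illustration and not part of the original answer. The fallback is there because an input made up entirely of rejected characters would otherwise produce an empty filename.

```python
def safe_filename(name, keep=(" ", ".", "_"), fallback="untitled"):
    """Keep alphanumeric characters (Unicode-aware) plus those in `keep`."""
    # str.isalnum() is True for Unicode letters and digits, so "ç" survives
    # while line breaks, quotes, brackets and other punctuation are dropped.
    cleaned = "".join(c for c in name if c.isalnum() or c in keep).rstrip()
    # `fallback` is a hypothetical default for inputs where nothing is left.
    return cleaned or fallback


print(safe_filename("ad\nbla'{-+\\)(ç?"))    # adblaç
print(safe_filename("my report_v2.txt\n"))   # my report_v2.txt
print(safe_filename("?!*"))                  # untitled
```

Note that this only filters characters. It does not check length limits or names that a particular operating system reserves, so treat it as a starting point rather than a complete solution.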