I need to sanitize some data which will be used in file names. Some of the data contains spaces and ampersand characters. Is there a function which will escape or sanitize data suitable for using in a file name (or path)? I couldn't find one in the 'Filesystem Function' section of the PHP manual.
So, assuming I have to write my own function, which characters do I need to escape (or change)?
Instead of filtering out characters, why not just allow [a-z0-9- !@#$%^()]? It is certainly easier than trying to guess every character that could potentially cause problems.
Your users shouldn't need a file with any other characters anyway, right?
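A minimal sketch of that whitelist idea, assuming you want disallowed characters replaced with an underscore rather than dropped. The function name and the `$replacement` parameter are made up for illustration. Note that the list as written has no dot, so a file extension's dot is replaced too; handle the extension separately if you need to keep it.

```php
<?php
// Hypothetical helper: keeps only the whitelisted characters and
// replaces everything else with $replacement.
function sanitize_filename_whitelist(string $name, string $replacement = '_'): string
{
    // Same character set as the answer above, matched case-insensitively.
    // The hyphen is placed last so it is read as a literal, not a range.
    $clean = preg_replace('/[^a-z0-9 !@#$%^()-]/i', $replacement, $name);

    // preg_replace() returns null on failure; fall back to an empty string.
    return $clean ?? '';
}

echo sanitize_filename_whitelist('Tom & Jerry: part 1.txt'), "\n";
// prints: Tom _ Jerry_ part 1_txt
```

Without the `u` modifier, a multi-byte UTF-8 character is replaced once per byte, which is usually acceptable when the goal is plain ASCII names.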
For Windows:
/ \ : * ? " < > |
For Unix, technically only / and the NUL byte are forbidden, but in practice the same list as Windows would be sensible.
There's nothing wrong with spaces or ampersands as long as you're prepared to use quotes on command lines when you're manipulating the files.
(BTW, I got that list by trying to rename a file on Windows to something including a colon, and copying from the error message.)
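If you go with the Windows list, something along these lines would swap those nine characters out and leave spaces and ampersands alone. This is only a sketch; the function name and the underscore replacement are my own choices, not anything built into PHP.

```php
<?php
// Hypothetical helper: replaces the characters Windows rejects in file names.
function replace_windows_reserved(string $name, string $replacement = '_'): string
{
    // The list from the answer above: / \ : * ? " < > |
    $reserved = ['/', '\\', ':', '*', '?', '"', '<', '>', '|'];

    // str_replace() accepts an array of needles and a single replacement.
    return str_replace($reserved, $replacement, $name);
}

echo replace_windows_reserved('report: Q1/Q2 "final" & more?.txt'), "\n";
// prints: report_ Q1_Q2 _final_ & more_.txt
```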
If you have the opportunity to store the original name in a database, I would simply create the file under a random hash (mt_rand()/md5/sha1). The benefit is that you don't depend on the underlying OS (allowed characters, path length) or on the value or length of the user input, and additionally it is really hard to guess or forge a file name. Base64 encoding may also be an option, but standard base64 output can contain / and +, so you would need a URL-safe variant.
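A rough sketch of the random-name approach. It uses random_bytes() (PHP 7+) instead of mt_rand()/md5, which is my substitution because it is the harder one to guess. The function name and the `$record` array are hypothetical, and the array stands in for whatever database table you actually use.

```php
<?php
// Hypothetical helper: builds an on-disk name that does not depend on user input.
function random_storage_name(int $bytes = 16): string
{
    // random_bytes() returns cryptographically secure bytes;
    // bin2hex() turns them into [0-9a-f], safe on any common file system.
    return bin2hex(random_bytes($bytes));
}

$originalName = 'Tom & Jerry: part 1.txt'; // user-supplied, kept as data only
$storedName   = random_storage_name();     // 32 hex characters by default

// Assumption: in a real application this pair goes into a database row.
$record = ['original_name' => $originalName, 'stored_name' => $storedName];
```

The original name is then only ever displayed or sent back in a download header, never used to build a path.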