My script downloads files from the net and then it saves them under the name taken from the same web server. I need a filter/remover of invalid characters for file/folder names under Windows NTFS.
I would be happy with a multi-platform filter too.
NOTE: something like htmlentities would be great.
Supported characters for a file name are letters, numbers, spaces, and ( ) _ - , . Please note that file names should be limited to 100 characters. Characters that are NOT supported include, but are not limited to: @ $ % & \ / : * ? " ' < > | ~ ` # ^ + = { } [ ] ; !
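As a rough sketch of the whitelist idea above: instead of listing every bad character, keep only the supported ones and replace everything else. The helper name whitelist_filename and the constant MAX_NAME_LENGTH are made up for illustration, and "letters" is assumed to mean ASCII letters only.

```ruby
# Hypothetical helper: keep only letters, digits, spaces and ( ) _ - , .
# Anything else is replaced. "Letters" is assumed to mean ASCII A-Z / a-z.
MAX_NAME_LENGTH = 100 # the limit quoted in the answer above

def whitelist_filename(name, replacement = '_')
  # The negated character class matches every character NOT on the whitelist.
  cleaned = name.gsub(/[^A-Za-z0-9 ()_\-,.]/, replacement)
  # Truncate to the first MAX_NAME_LENGTH characters.
  cleaned[0, MAX_NAME_LENGTH]
end

puts whitelist_filename('report: final?.txt') # prints "report_ final_.txt"
```

A whitelist is stricter than necessary for NTFS, but it avoids having to know every character that might cause trouble.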
To clarify this answer, these special characters could interfere with parsing a command line (or path) if they were in a filename.
Same problem as with the "?". These are not invalid characters on Unix; typically only the NUL character and the / character are invalid in filenames (the / being the directory separator).
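For the Unix case described here, a filter only has to deal with those two characters. A minimal sketch, where unix_safe_filename is a hypothetical name and the underscore replacement is just one possible choice:

```ruby
# Hypothetical helper: on Unix-like systems only NUL and "/" are replaced.
def unix_safe_filename(name, replacement = '_')
  name.gsub(/[\x00\/]/, replacement)
end

puts unix_safe_filename('what?/ever*') # prints "what?_ever*"
```

Characters such as ? and * survive, which is fine for the file system but may still need quoting on a command line.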
Like Geo said, by using gsub you can easily convert all invalid characters to a valid character. For example:
file_names.map! do |f|
  f.gsub(/[<invalid characters>]/, '_')
end
You need to replace <invalid characters> with all the possible characters that your file names might have in them that are not allowed on your file system. In the above code each invalid character is replaced with a _.
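Wikipedia tells us that the following characters are not allowed on NTFS: NUL (U+0000), / (slash), \ (backslash), : (colon), * (asterisk), ? (question mark), " (quote), < (less than), > (greater than), | (pipe)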
So your gsub call could be something like this:
file_names.map! { |f| f.gsub(/[\x00\/\\:\*\?\"<>\|]/, '_') }
which replaces all the invalid characters with an underscore.
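To make that easier to reuse, the same character class can be wrapped in a small helper. This is only a sketch: NTFS_INVALID and ntfs_safe_filename are hypothetical names, the sample file names are made up, and it covers only the characters matched by the gsub call above.

```ruby
# Same characters as in the gsub call above: NUL / \ : * ? " < > |
NTFS_INVALID = /[\x00\/\\:*?"<>|]/

# Hypothetical helper: replace each invalid character in a single name.
def ntfs_safe_filename(name, replacement = '_')
  name.gsub(NTFS_INVALID, replacement)
end

# Made-up sample names, as they might come back from a web server.
file_names = ['a/b.txt', 'what?.jpg', 'ok.png']
file_names.map! { |f| ntfs_safe_filename(f) }
p file_names # prints ["a_b.txt", "what_.jpg", "ok.png"]
```

Keeping the pattern in a constant means the list of characters is written once, and the replacement can be swapped per call if an underscore is not wanted.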