I do website development on OS X, and fairly often I need to move some part of a live website (running Linux/LAMP) to a development server on my own machine. One such case involves downloading images (user-generated content, e.g. via FTP), processing them in one way or another, and then putting them back on the production site.
The image files involved, having been created on a Linux machine, appear to have their filenames encoded in UTF-8 using NFC (the precomposed normalization form). OS X's HFS+ file system, on the other hand, does not store NFC filenames and converts them to NFD (the decomposed form). Once I am done and want to upload the files, their names are therefore in NFD, and because Linux accepts both forms and treats them as different names, they stay in NFD on the server. As a result, the newly uploaded (and in some cases replaced) files are not accessible at the expected URL.
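To make the mismatch concrete, here is a minimal Python sketch using the standard unicodedata module. The filename is a made-up example, not one taken from the site in question.

```python
import unicodedata

# Hypothetical filename, used only for illustration.
name = "caf\u00e9.jpg"  # "é" as a single precomposed code point

nfc = unicodedata.normalize("NFC", name)  # stays precomposed
nfd = unicodedata.normalize("NFD", name)  # "e" followed by U+0301 (combining acute)

# Both strings render the same on screen, but they are different byte
# sequences, so a Linux file system sees two different filenames.
print(nfc == nfd)           # False
print(nfc.encode("utf-8"))  # b'caf\xc3\xa9.jpg'
print(nfd.encode("utf-8"))  # b'cafe\xcc\x81.jpg'
```

A URL that was generated from the NFC name will therefore not match a file that was uploaded under the NFD name.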
I'm looking for a way to change the Unicode normalization of the filenames during (preferably) or after the transfer, since I'm guessing it's impossible to do beforehand. convmv looks like a good option for doing it afterwards, but I don't have sufficient permissions on this server, so it's not possible in this particular case. I've tried FTP upload using Transmit and rsync (using a deploy script I normally use) to no avail. The --iconv option in rsync seemed ideal, but unfortunately my server, running rsync 2.6.9, did not recognize it.
I'm guessing quite a few people are having similar issues, so I'll be happy to hear any solution or workaround!
UPDATE: In this case I ended up rsyncing the files to a virtual machine running Ubuntu, running convmv on them there, and then rsyncing again to my staging server. While this works fairly well, it is a bit time-consuming. Perhaps it would be possible to mount an ext file system on OS X and just store the files there instead, keeping their original NFC-normalized filenames?
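For reference, the renaming step that convmv performs can be approximated in a few lines of Python. This is only a sketch, assuming Python 3 is available on a Linux machine (running it on HFS+ would be pointless, since the names get converted back to NFD). The function names are made up for illustration, and it is not a replacement for convmv.

```python
import os
import unicodedata


def to_nfc(name):
    """Return the NFC-normalized form of a file or directory name."""
    return unicodedata.normalize("NFC", name)


def plan_renames(root):
    """Collect (old_path, new_path) pairs for names under root that are not NFC.

    The walk is bottom-up, so entries inside a directory are listed
    before the directory itself and paths stay valid while renaming.
    """
    renames = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            fixed = to_nfc(name)
            if fixed != name:
                renames.append(
                    (os.path.join(dirpath, name), os.path.join(dirpath, fixed))
                )
    return renames


def apply_renames(renames, dry_run=True):
    """Print the planned renames, or perform them when dry_run is False."""
    for old, new in renames:
        if os.path.exists(new):
            # Never overwrite an existing file that already has the NFC name.
            print(f"skip (target exists): {old}")
        elif dry_run:
            print(f"would rename: {old} -> {new}")
        else:
            os.rename(old, new)
```

Calling apply_renames(plan_renames("uploads")), where "uploads" is a placeholder directory, only prints what would change; passing dry_run=False performs the renames.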
Also, to avoid this problem altogether on future WordPress installs, which was my use case, you can add a simple add_filter('sanitize_file_name', 'remove_accents'); before uploading any files, and you should be fine.
It seems that rsync --iconv is the best solution, as you can transfer the files and transcode the names all in one step. You just need to convince your host to upgrade their rsync. Given that the --iconv feature was introduced in rsync 3.0.0, which was released in 2008, it's a bit odd that your host is still running rsync 2.6.9.
If you can't convince your host to install an up-to-date rsync, you could compile your own rsync, upload it somewhere like ~/bin on the server, and put that directory ahead of the system-installed rsync in your PATH. Then you should be able to use the --iconv option. This should work as long as you are using rsync over SSH (the default) rather than the rsync daemon, because rsync over SSH works by SSHing to the remote machine and running rsync --server with the same options that you passed to your local rsync.
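If changing PATH on the server is awkward, rsync's --rsync-path option can name the remote binary explicitly instead. The following Python sketch only builds and prints such a command. The host, the paths, and the helper names are placeholders rather than values from the question, and both ends still need an rsync that supports --iconv.

```python
import shlex
import subprocess


def build_rsync_command(src, dest, remote_rsync=None, iconv=None):
    """Build an rsync argument list for a transfer over SSH."""
    cmd = ["rsync", "-av"]
    if iconv:
        # Order is LOCAL,REMOTE charset.
        cmd.append(f"--iconv={iconv}")
    if remote_rsync:
        # Run this binary on the remote side instead of the one in PATH.
        cmd.append(f"--rsync-path={remote_rsync}")
    cmd += [src, dest]
    return cmd


def run_rsync(cmd):
    """Run the command and raise if rsync exits with an error."""
    subprocess.run(cmd, check=True)


# Placeholder values; adjust to your own machine and server.
cmd = build_rsync_command(
    "/Users/username/path/on/machine/",
    "user@example.com:/home/username/path/on/server/",
    remote_rsync="/home/username/bin/rsync",  # assumed location of the self-compiled rsync
    iconv="UTF-8-MAC,UTF-8",
)
print(shlex.join(cmd))
```

Printing the command first makes it easy to check it by hand before passing it to run_rsync.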
Or you could find a host that has up-to-date tools and Perl installed.
Currently I'm using rsync --iconv like this:
Given a Linux server and an OS X machine:
To copy from the server to the machine, run this command on the server (it won't work from OS X):
rsync --iconv=UTF-8,UTF-8-MAC /home/username/path/on/server/ '[email protected]:/Users/username/path/on/machine/'
To copy from the machine to the server, run this command on the machine:
rsync --iconv=UTF-8-MAC,UTF-8 /Users/username/path/on/machine/ '[email protected]:/home/username/path/on/server/'