
Special characters encoding in image filenames after server migration

I've migrated a WordPress website from a HostGator shared host to an Ubuntu DigitalOcean LAMP stack.

The trouble started when I exported the image files whose names contain special characters, for example the file operários_tarsila-1024x640.jpg.

When WordPress tries to reach the file, it displays an error. I've found the cause:

I can see via Inspect Element that WordPress requests http://mywebsite.com/wp-content/uploads/2013/02/oper%C3%A1rios_tarsila-1024x640.jpg and the server returns a 404 error.

However, if I type this URL in the browser: http://mywebsite.com/wp-content/uploads/2013/02/opera%CC%81rios_tarsila-1024x640.jpg it works and the image is displayed.

So it seems that the difference in how á is encoded, %C3%A1 (the single precomposed character á, U+00E1) versus a + %CC%81 (the letter a followed by a combining acute accent, U+0301), is what prevents WordPress from displaying my images.
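
A small Python sketch (an illustration only, not part of the original setup) shows the two forms side by side. It assumes the difference is exactly Unicode NFC (precomposed) versus NFD (decomposed) normalization, which matches the two URLs above:

```python
import unicodedata
from urllib.parse import quote

# "operários" written with an escape so the starting form is unambiguous.
name = "oper\u00e1rios"

nfc = unicodedata.normalize("NFC", name)  # single code point U+00E1
nfd = unicodedata.normalize("NFD", name)  # "a" followed by U+0301

print(quote(nfc))  # oper%C3%A1rios  (what WordPress requests)
print(quote(nfd))  # opera%CC%81rios (what is stored on the server)

# The strings look identical on screen but compare as different,
# so a byte-for-byte filename lookup fails.
print(nfc == nfd)  # False

# Normalizing one form to the other makes them equal again.
print(unicodedata.normalize("NFC", nfd) == nfc)  # True
```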

So now my server holds thousands of accented image filenames stored as base character + combining accent, while WordPress requests them as single precomposed accented characters.

Is there a way to rename all of them in bash using a comparison table? Or a way to make Apache aware of these differences so that it points to the right file when this kind of mismatch happens?

steps asked Oct 25 '15 03:10


3 Answers

We had the same problem: Mac + FileZilla + special characters in the SK language.

The problem was fixed by using another FTP client (Cyberduck in our case).

It seems to be a problem with how FileZilla encodes filenames. Forcing UTF-8 encoding (in the FileZilla host settings) didn't help.

Branislav answered Nov 07 '22 08:11


Apparently the problem is how the backup is decompressed on the new server.

There are 2 ways to fix this:

  1. Manually rename the files to names without accents, then update the file names in the database to match. (This is crazy and can be dangerous; it would be best to back up the database first.)

  2. Upload the files using FileZilla, but set it to force the charset encoding to UTF-8.

File > Site Manager > {YOUR SITE} > Charset tab > Force UTF-8
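
As a variation on option 1, and not something this answer's author describes, the files could instead be renamed in bulk to the precomposed form that WordPress requests, which would leave the database untouched. The Python sketch below is a minimal illustration under several assumptions: the server filesystem treats the two forms as different names (as ext4 on Ubuntu does), the locale is UTF-8, and `normalize_tree` is a hypothetical helper name. It defaults to a dry run that only reports what it would rename.

```python
import os
import unicodedata


def normalize_tree(root: str, form: str = "NFC", dry_run: bool = True) -> list[tuple[str, str]]:
    """Rename entries under root whose names are not already in `form`.

    Returns the (source, destination) pairs that were (or would be) renamed.
    """
    renames = []
    # Walk bottom-up so that children are handled before the directory
    # that contains them is itself renamed.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            target = unicodedata.normalize(form, name)
            if target == name:
                continue  # already in the requested form
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, target)
            if os.path.exists(dst):
                # Never overwrite an existing file; leave it for manual review.
                print(f"skipped (target exists): {src}")
                continue
            renames.append((src, dst))
            if not dry_run:
                os.rename(src, dst)
    return renames
```

A cautious way to use it would be to call `normalize_tree("/path/to/wp-content/uploads")` first (the path is a placeholder), inspect the returned pairs, and only then repeat the call with `dry_run=False`, ideally after backing up the uploads folder.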

Alorse answered Nov 07 '22 10:11


So, just to touch upon this issue and a solution that worked for me... I also migrated a WordPress site and found that all images with special characters in their filename produced a 404 after migration.

I ended up having to do the manual file renaming and edits to the database via phpMyAdmin. It was arduous and I definitely recommend backing up your database first.

In my case, I had a ton of media attachments that used the special character © in their filename.

First, I renamed the files locally to remove the character. I used 1-4a rename: I searched for the character in each filename and replaced it with nothing (not even a space). Then I removed all the old files from the /wp-content/uploads/ folder and uploaded the renamed files in their place.

Next, I went into my database to update the table values. Media attachments have info stored in both the wp_posts and wp_postmeta tables. Below is the SQL I ran to update both:
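
For anyone without a rename tool at hand, the same local step could be scripted. The following Python sketch is only an illustration of the idea, not what I actually ran; `plan_removal` and `apply_plan` are hypothetical helper names, and the folder you pass in is up to you. It builds the rename plan first so it can be reviewed, and refuses to overwrite an existing file.

```python
from pathlib import Path


def plan_removal(root, char="\u00a9"):
    """Map each file whose name contains `char` (default: ©) to its new path."""
    plan = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or char not in path.name:
            continue
        new_name = path.name.replace(char, "")  # replace with nothing
        if not new_name:
            continue  # the name was only the character; leave it alone
        plan[path] = path.with_name(new_name)
    return plan


def apply_plan(plan):
    """Perform the renames, stopping if a target name is already taken."""
    for src, dst in plan.items():
        if dst.exists():
            raise FileExistsError(f"{dst} already exists, not renaming {src}")
        src.rename(dst)
```

Printing the result of `plan_removal(...)` before calling `apply_plan(...)` gives a chance to spot two files that would collapse into the same name.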

-- Strip the character from every attachment GUID.
UPDATE wp_posts SET guid = REPLACE(guid, '©', '');

-- Strip it from meta values that end in an image extension.
UPDATE wp_postmeta SET meta_value = REPLACE(meta_value, '©', '')
WHERE LOWER(RIGHT(meta_value, 5)) = '.jpeg' OR
LOWER(RIGHT(meta_value, 4)) IN ('.jpg', '.gif', '.png');

Again, this replaces the character with nothing, not even a space.

I had to use the WP plugin Regenerate Thumbnails to get all of the thumbnails and the various attachment sizes to update, but that did the trick.

I really appreciate everyone's efforts on this post and this post to help me figure it out! Hope this helps someone!

RCNeil answered Nov 07 '22 08:11