
Sublime Text: Not representable characters

I'm using Sublime Text for LaTeX, so I need to use a specific encoding. However, in some cases, when I paste text copied from a different program (Word or a browser in most cases), I get the message:

"Not all characters are representable in XXX encoding, falling back to UTF-8"

My question is: Is there any way to see which parts of the text cannot be encoded, so I can delete them manually?

blue_note asked Sep 09 '14 10:09


3 Answers

I had this problem. It is caused by corrupt characters in your document. Here is how I solved it.

1) Search your document for anything that is not a standard character. Make sure regular expressions are enabled in the search panel, then paste this:

[^a-zA-Z0-9 -\.;<>/ ={}\[\]\^\?_\\\|:\r\n@]

You can add the accented characters your language normally uses, such as é and à. Here is the pattern with the French and German characters added:

[^a-zA-Z0-9 -\.;<>/ ='{}\[\]\^\?_\\\|:\r\n~@éàèêîôâûçäöüÄÖÜß]

2) Search for that and keep pressing F3 until you land on mangled characters, usually something like "è", which is a corrupt version of "à".

3) Delete those characters or replace them with what they should be.

You will be able to convert the document to another encoding when you have cleared all corrupt characters out.
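
If you would rather have the offending characters listed than hunt for them with F3, the same check can be done outside the editor. This is only a minimal sketch, not a Sublime Text feature: `find_unencodable` is a hypothetical helper name, and `iso-8859-1` is an assumed target encoding that you should replace with whatever your LaTeX setup expects. It tries to encode each character on its own and reports the ones that fail.

```python
import unicodedata


def find_unencodable(text, encoding="iso-8859-1"):
    """Return (line, column, character) for every character that
    cannot be represented in the given encoding (1-based positions)."""
    problems = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            try:
                ch.encode(encoding)
            except UnicodeEncodeError:
                problems.append((line_no, col_no, ch))
    return problems


if __name__ == "__main__":
    # Sample text: the euro sign and the curly quotes are not in ISO-8859-1.
    sample = "caf\u00e9 costs 3\u20ac\nquote: \u201chello\u201d"
    for line_no, col_no, ch in find_unencodable(sample):
        name = unicodedata.name(ch, "UNKNOWN")
        print(f"line {line_no}, col {col_no}: U+{ord(ch):04X} {name}")
```

To run it on a real file, read the file as UTF-8 first (for example `open(path, encoding="utf-8").read()`) and pass the text to the function. The reported line and column numbers should match what Sublime Text shows in the status bar, as long as the file contains no unusual line separators.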

Draken answered Oct 24 '22 08:10


For Linux users, it's also possible to remove broken characters automatically with the iconv command:

iconv -f UTF-8 -t Windows-1251 -c < ~/temp/data.csv > ~/temp/data01.csv

-c Silently discard characters that cannot be converted instead of terminating when encountering such characters.
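
On systems without iconv, roughly the same effect can be had from Python. This is a sketch under assumptions, not a drop-in replacement: `transcode_lossy` is a hypothetical helper name, and `cp1251` is Python's codec name for Windows-1251. Unlike `iconv -c`, this version still raises an error if the input is not valid UTF-8; it only drops characters that the target encoding cannot represent.

```python
def transcode_lossy(data, src="utf-8", dst="cp1251"):
    """Decode bytes from `src`, then encode to `dst`, silently dropping
    any character that `dst` cannot represent."""
    text = data.decode(src)  # strict: invalid input bytes raise an error
    return text.encode(dst, errors="ignore")


if __name__ == "__main__":
    # The snowman (U+2603) has no Windows-1251 equivalent, so it is dropped.
    original = "\u041f\u0440\u0438\u0432\u0435\u0442 \u2603!".encode("utf-8")
    converted = transcode_lossy(original)
    print(len(original), "bytes in,", len(converted), "bytes out")
```

For files, read the source with `open(path, "rb").read()` and write the result with `open(other_path, "wb")`, so that Python does not apply any encoding of its own in between.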

LexeY4eg answered Oct 24 '22 07:10


Just adding to @Draken's response: here is the regex with Spanish characters added.

[^a-zA-Z0-9 -\.;<>/ =“”'{}\[\]\^\?_\\\|:\r\n~@àèêîôâûçäöüÄÖÜßáéíóúñÑ¿€]

In my case I pressed Ctrl+H (Replace) and left the replacement expression empty, so everything was cleared very quickly and I was able to save the file as ISO-8859-1.

Juan Javier Triff Cabanas answered Oct 24 '22 07:10