
Highlighting and replacing non-printable unicode characters in Emacs

Tags:

emacs

unicode

I have a UTF-8 file containing some Unicode characters, such as LEFT-TO-RIGHT OVERRIDE (U+202D), which I want to remove from the file. Emacs hides them by default (which is presumably the correct behavior?). How do I make such "exotic" Unicode characters visible without changing the display of "regular" Unicode characters like German umlauts? And how do I replace them afterwards, for example with replace-string? C-x 8 RET does not work in isearch or replace-string.

In Vim, it's quite easy: these characters are displayed as their hex representation by default (is this a bug or a missing feature?), and you can easily remove them with, for example, :%s/\%u202d//g. Surely this is possible in Emacs too?

Christian asked Sep 26 '11 19:09




1 Answer

How about this:

Put the U+202D character you want to match at the top of the kill ring by typing M-: (kill-new "\u202d") RET. Then you can yank that string into the various search commands, with either C-y (e.g. query-replace) or M-y (e.g. isearch-forward).

(Edited to add:)

You could also just call commands non-interactively, which doesn't present the same keyboard-input difficulties as the interactive calls. For example, type M-: and then:

(replace-string "\u202d" "")

This is somewhat similar to your Vim version. One difference is that it only performs replacements from point (the cursor position) to the end of the buffer (or narrowed region), so to replace all matches you need to move to the beginning of the buffer (or narrowed region) before running the command.
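If you would rather not move point by hand, the same replacement can be wrapped so that it always starts at the beginning of the accessible buffer and puts point back afterwards. The following is only a sketch: the function name my-delete-char-in-buffer is hypothetical (not something Emacs provides), and it uses the search-forward / replace-match loop rather than replace-string.

```elisp
;; Hypothetical helper (the name is an assumption, not part of Emacs).
(defun my-delete-char-in-buffer (char)
  "Delete every occurrence of the character CHAR in the accessible buffer.
Respects narrowing, restores point, and returns the number of deletions."
  (let ((count 0)
        (needle (char-to-string char)))
    (save-excursion
      ;; Start from the top so matches before point are found too.
      (goto-char (point-min))
      (while (search-forward needle nil t)
        ;; Replace the match with the empty string, literally.
        (replace-match "" t t)
        (setq count (1+ count))))
    count))

;; Example: remove LEFT-TO-RIGHT OVERRIDE (U+202D) from the current buffer.
;; (my-delete-char-in-buffer #x202d)
```

You could evaluate the definition with M-: or put it in your init file, then call it with M-: (my-delete-char-in-buffer #x202d) RET.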

Sean answered Sep 17 '22 21:09