Finding the Unicode codepoint of a character in GNU Emacs

Tags:

emacs

unicode

In XEmacs this is done by calling the function char-to-ucs on a character. GNU Emacs does not seem to have this function. In GNU Emacs, characters seem to be ordinary integers. Running C-x = on a Latin character reveals that the Emacs codepoint is different from the Unicode codepoint for the corresponding character. How do I find the Unicode codepoint of the character at point in GNU Emacs?


1 Answer

In a modern Emacs, M-x describe-char will tell you about the character at point.
An example:

  character: ¢ (2210, #o4242, #x8a2, U+00A2)
    charset: latin-iso8859-1
         (Right-Hand Part of Latin Alphabet 1 (ISO/IEC 8859-1): ISO-IR-100.)
 code point: #x22
     syntax: w  which means: word
   category: l:Latin
buffer code: #x81 #xA2
  file code: #xC2 #xA2 (encoded by coding system utf-8)
    display: by this font (glyph code)
     -apple-monaco-medium-r-normal--12-120-72-72-m-120-mac-roman (#xA2)

Note the U+00A2 in the first part, which gives the Unicode codepoint of the character.
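
If you need the codepoint programmatically rather than in a help buffer, something along these lines may work. This is only a sketch: the function name my-unicode-codepoint-at-point is made up for illustration, and it assumes that encode-char with the ucs charset is available in your Emacs, which should be the case for Emacs 22 and later.

```elisp
;; Hypothetical helper, not part of Emacs: the name is an assumption.
(defun my-unicode-codepoint-at-point ()
  "Show the Unicode codepoint of the character at point as U+XXXX."
  (interactive)
  (let ((ch (char-after)))
    (if (null ch)
        ;; char-after returns nil at the end of the buffer.
        (message "No character at point")
      ;; encode-char returns nil when the character has no
      ;; mapping in the requested charset.
      (let ((cp (encode-char ch 'ucs)))
        (if cp
            (message "U+%04X" cp)
          (message "No Unicode mapping for this character"))))))
```

Run it with M-x my-unicode-codepoint-at-point while point is on the character. On the ¢ from the example above it should report U+00A2.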

Dwight Holman answered Sep 09 '25 19:09