Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to get terminal's Character Encoding

Tags:

linux

People also ask

How do you set character encoding?

Set Character Encoding. Provide right-click menu to manually set character encoding for web pages. Right-click at somewhere on web page to manually set character encoding. The selected character set will automatically apply to all pages on the same site.

Is UTF-8 character set or encoding?

UTF-8 is a character encoding system. It lets you represent characters as ASCII text, while still allowing for international characters, such as Chinese characters. As of the mid 2020s, UTF-8 is one of the most popular encoding systems.


The terminal uses environment variables to determine which character set to use, therefore you can determine it by looking at those variables:

echo $LC_CTYPE

or

echo $LANG

locale command with no arguments will print the values of all of the relevant environment variables except for LANGUAGE.

For current encoding:

locale charmap

For available locales:

locale -a

For available encodings:

locale -m

Check encoding and language:

$ echo $LC_CTYPE
ISO-8859-1
$ echo $LANG
pt_BR

Get all languages:

$ locale -a

Change to pt_PT.utf8:

$ export LC_ALL=pt_PT.utf8 
$ export LANG="$LC_ALL"

If you have Python:

python -c "import sys; print(sys.stdout.encoding)"

To my knowledge, no.

Circumstantial indications from $LC_CTYPE, locale and such might seem alluring, but these are completely separated from the encoding the terminal application (actually an emulator) happens to be using when displaying characters on the screen.

They only way to detect encoding for sure is to output something only present in the encoding, e.g. ä, take a screenshot, analyze that image and check if the output character is correct.

So no, it's not possible, sadly.


To see the current locale information use locale command. Below is an example on RHEL 7.8

[usr@host ~]$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=