How to escape unicode characters in bash prompt correctly

I have a specific method for my bash prompt, let's say it looks like this:

CHAR="༇ "
my_function="
    prompt=\" \[\$CHAR\]\"
    echo -e \$prompt"

PS1="\$(${my_function}) \$ "

To explain the above: I'm building my bash prompt by executing a function stored in a string, a decision that came out of an earlier question. Let's pretend it works fine, because it does, except when Unicode characters get involved.

I am trying to find the proper way to escape a Unicode character, because right now it throws off bash's line-length calculation. An easy way to test whether it's broken: type a long command, execute it, press CTRL-R and type to find it, then press CTRL-A and CTRL-E to jump to the beginning and end of the line. If the text gets garbled, it's not working.

I have tried several things to properly escape the unicode character in the function string, but nothing seems to be working.

Special characters like this work:

COLOR_BLUE=$(tput sgr0 && tput setaf 6)

my_function="
    prompt=\"\\[\$COLOR_BLUE\\] \"
    echo -e \$prompt"

That is the main reason I made the prompt a function string. The color escape sequence does NOT mess with the line length; only the Unicode character does.

asked Aug 18 '11 at 19:08 by Andy Ray



1 Answer

The \[...\] sequence tells Bash to leave this part of the string out of its prompt-length calculation, which is useful when your prompt contains a zero-width sequence, such as a control sequence which changes the text color or the title bar. But in this case, you are printing a character, so its width is not zero. Perhaps you could work around this by using a no-op escape sequence to fool Bash into calculating the correct line length, but it sounds like that way lies madness.

The correct solution would be for the line length calculations in Bash to correctly grok UTF-8 (or whichever Unicode encoding it is that you are using). Uhm, have you tried without the \[...\] sequence?
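To illustrate the kind of mismatch involved, here is a small sketch that compares the byte count of the character with the character count the shell reports. It only shows that the two numbers can differ; it makes no claim about how Bash computes prompt width internally. The variable names are made up for the example, and it assumes the script is saved as UTF-8.

```shell
# Assumes this file is saved as UTF-8.
glyph='༇'   # the character from the question, without the trailing space

# Byte count: independent of the current locale.
bytes=$(printf '%s' "$glyph" | wc -c | tr -d '[:space:]')

# Character count: depends on the locale the shell is running in.
chars=${#glyph}

echo "bytes=$bytes chars=$chars"
```

The character takes three bytes in UTF-8, so any width calculation that counts bytes rather than screen columns will be off for it.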

Edit: The following implements the solution I propose in the comments below. The cursor position is saved, then two spaces are printed, outside of \[...\], then the cursor position is restored, and the Unicode character is printed on top of the two spaces. This assumes a fixed font width, with double width for the Unicode character.

PS1='\['"$(tput sc)"'\]  \['"$(tput rc)"'༇ \] \$ '

At least in the OSX Terminal, Bash 3.2.17(1)-release, this passes cursory [sic] testing.

In the interest of transparency and legibility, I have ignored the requirement to have the prompt's functionality inside a function, and the color coding; this just changes the prompt to the character, space, dollar prompt, space. Adapt to suit your somewhat more complex needs.
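As a starting point for that adaptation, the same save/restore trick can be wrapped in a function. This is only a sketch: build_ps1 and its cols parameter are hypothetical names, the column count is something you have to supply yourself, and the fallback to the raw ESC 7 / ESC 8 sequences when tput fails is an assumption about your terminal.

```shell
# Sketch only: build_ps1 is a hypothetical helper, not part of the answer above.
# $1: glyph plus any trailing fill, e.g. '༇ '
# $2: ASSUMPTION: number of screen columns that text occupies (default 2)
build_ps1() {
    local glyph cols sc rc pad
    glyph=$1
    cols=${2:-2}

    # Save/restore cursor sequences; fall back to ESC 7 / ESC 8 if tput fails.
    sc=$(tput sc 2>/dev/null) || sc=$(printf '\033%s' 7)
    rc=$(tput rc 2>/dev/null) || rc=$(printf '\033%s' 8)

    # The spaces are what Bash counts; they sit outside \[...\].
    pad=$(printf '%*s' "$cols" '')

    # \[save\] spaces \[restore glyph\], then the usual " \$ " prompt tail.
    printf '\\[%s\\]%s\\[%s%s\\] \\$ ' "$sc" "$pad" "$rc" "$glyph"
}

PS1=$(build_ps1 '༇ ' 2)
```

Color sequences can be added inside the \[...\] parts in the same way, since they take up no columns on screen.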

answered Oct 21 '22 at 17:10 by tripleee