I'm trying to apply control characters to a string, such as '\x08 \x08' (move backwards, write a space, move backwards), which should remove the preceding character.
For example, when I type this into the Python console:
s = "test\x08 \x08"
print s
print repr(s)
I get in my terminal :
tes
'test\x08 \x08'
I'm looking for a function, let's say "function", that will 'apply' the control characters to my string:
v = function("test\x08 \x08")
sys.stdout.write(v)
sys.stdout.write(repr(v))
so I get a "clean", control-characters-free string:
tes
tes
I understand that in a terminal this part is handled by the client, so maybe there is a way to get the displayed string using core Unix functions:
echo -e 'test\x08 \x08' > file.out
cat file.out # control chars are handled here by the client
>> tes
cat -v file.out # which prints the "actual" content of the file
>> test^H ^H
Actually, the answer was a bit more complicated than a simple formatting.
Every character sent by the process to the terminal can be seen as a transition in a Finite State Machine (FSM). This FSM's state roughly corresponds to the text displayed and the cursor position, but there are many other variables, such as the dimensions of the terminal, the control sequence currently being input*, the terminal mode (e.g. VI mode / classic BASH console), etc.
A good implementation of this FSM can be seen in the pexpect source code.
To answer my question: there is no core Unix "function" that can format the string to what is displayed in the terminal, since such a function is specific to the terminal that renders the process's output, and you would have to rewrite a full terminal to handle every possible character and control sequence.
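However, we can implement a simple one ourselves. We need to define an FSM with an initial state (an empty displayed string and a cursor at position 0)
and transitions (input characters):
an alphanumeric or whitespace character: writes the character at the cursor position (overwriting any character already there) and increments the cursor position
\x08 hex code: decrements the cursor position
and feed it the string.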
def decode(input_string):
    # Initial state
    # The string is stored as a list because
    # Python strings are immutable
    displayed_string = []
    cursor_position = 0
    # Loop on our input (transitions sequence)
    for character in input_string:
        # Alphanumeric or whitespace transition
        if character.isalnum() or character.isspace():
            # Write the character at the cursor, overwriting any existing one
            displayed_string[cursor_position:cursor_position + 1] = character
            # Move the cursor forward
            cursor_position += 1
        # Backward transition
        elif character == "\x08":
            # Move the cursor backward, but never before the start of the string
            cursor_position = max(cursor_position - 1, 0)
        else:
            print("{} is not handled by this function".format(repr(character)))
    # We transform our "list" string back to a real string
    return "".join(displayed_string)
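And an example (note the trailing space: as in a real terminal, the last "t" is overwritten with a space, not deleted):
>>> decode("test\x08 \x08")
'tes '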
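To make the "move the cursor, then overwrite" behaviour more concrete, here is a minimal sketch in Python 3 of the same idea with one extra transition. The name render_line is hypothetical, and the carriage return ("\r") handling is my own addition for illustration, assuming the usual behaviour of jumping back to the first column. It is not a full terminal, only a slightly larger FSM.

```python
def render_line(data):
    """Apply printable characters, backspace and carriage return to one line."""
    cells = []   # characters currently shown on the line
    cursor = 0   # index of the cell the next character will be written to
    for ch in data:
        if ch == "\x08":
            # Backspace: move left, never past the first column
            cursor = max(cursor - 1, 0)
        elif ch == "\r":
            # Carriage return (assumed extension): jump back to the first column
            cursor = 0
        elif ch.isprintable():
            # Printable character: overwrite the cell under the cursor
            cells[cursor:cursor + 1] = ch
            cursor += 1
        # Anything else is silently ignored in this sketch
    return "".join(cells)


print(repr(render_line("test\x08 \x08")))   # 'tes '
print(repr(render_line("abc\x08\x08X")))    # 'aXc'
print(repr(render_line("50%\r100%")))       # '100%'
```

The second call shows why a backspace is not a deletion: the cursor moves back two cells and only the "b" is overwritten, so the "c" stays on screen.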
* An ANSI control sequence is a sequence of characters that acts as a transition on the state of the terminal (display, cursor, terminal mode, ...). It can be seen as a refinement of our FSM state and transitions with more sub-states and sub-transitions.
For example, when you press the UP key in a classic Unix terminal (such as the VT100), you actually enter the control sequence ESC [ A (or ESC O A, with the letter O, when the terminal is in application cursor-key mode), where ESC is hex code \x1b. ESC transitions to ESCAPE mode, and the terminal returns to normal mode after A.
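The ESCAPE mode described above can be sketched as a few extra states in the FSM. This is only an illustration in Python 3, assuming the two common encodings of the UP key, ESC [ A and ESC O A (letter O). The function name strip_escape_sequences and the state names are hypothetical, and the sketch simply drops the sequences instead of acting on them.

```python
ESC = "\x1b"


def strip_escape_sequences(data):
    """Drop ESC [ ... <final> and ESC O <char> sequences, keep everything else."""
    output = []
    state = "NORMAL"
    for ch in data:
        if state == "NORMAL":
            if ch == ESC:
                state = "ESCAPE"      # ESC switches to escape mode
            else:
                output.append(ch)     # ordinary character, keep it
        elif state == "ESCAPE":
            if ch == "[":
                state = "CSI"         # parameters follow until a final byte
            elif ch == "O":
                state = "SS3"         # exactly one more character follows
            else:
                state = "NORMAL"      # treat as a two-character sequence
        elif state == "CSI":
            if "@" <= ch <= "~":
                state = "NORMAL"      # a final byte (such as A) ends the sequence
        elif state == "SS3":
            state = "NORMAL"          # the single character after ESC O
    return "".join(output)


print(repr(strip_escape_sequences("ab\x1b[Acd")))        # 'abcd'
print(repr(strip_escape_sequences("\x1b[31mred\x1b[0m")))  # 'red'
```

A real terminal would update its cursor or colours in the branches where this sketch only changes state, which is where the "sub-states and sub-transitions" come from.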
Some processes interpret this sequence as a move of the vertical cursor position (VI), others as a move backward in the history (BASH) : it depends fully on the program that handles the input.
However, the same sequence can also be sent by the process as output, in which case it will most likely move the cursor up on the screen: it depends on the terminal implementation.
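As a small illustration of the output direction, the sketch below (Python 3) writes the "cursor up" sequence to the terminal in order to replace a line that was already printed. The helper name cursor_up and the constant ERASE_LINE are hypothetical; the sequences used are the standard CSI n A (cursor up) and CSI 2 K (erase line). What you actually see depends on the terminal, so this assumes an ANSI-compatible one.

```python
import sys

ESC = "\x1b"


def cursor_up(lines=1):
    # CSI <n> A: ask the terminal to move the cursor up by n lines
    return "{}[{}A".format(ESC, lines)


ERASE_LINE = ESC + "[2K"  # CSI 2 K: erase the whole current line

sys.stdout.write("first version\n")
# Go back up one line, return to column 0, clear it, then write the new text
sys.stdout.write(cursor_up(1) + "\r" + ERASE_LINE + "second version\n")
```

If the output is redirected to a file instead, the raw bytes are stored as they are, just like the ^H characters shown by cat -v in the question.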
A good list of ANSI control sequences is available here.