
What is the use for control characters in string.printable?

I was learning about the format mini-language and I scrolled up to view some string things, and I wondered what Python considered printable, so I checked:

>>> string.printable
'0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'

Note that at the end, along with other normal printable control-type characters like ^I, ^J, and ^M, there are also \x0b and \x0c (^K and ^L).

What would be the use for these?

Richard Haley asked Oct 16 '19 20:10



3 Answers

Those characters are whitespace, which Python counts as printable (they also appear in string.whitespace). \x0b (\v) is the vertical tab and \x0c (\f) is the form feed (docs).

Terminals may render them differently, but \v usually looks like the following:

>>> print("some\vtext\vhere")
some
    text
        here

\f tells a printer to eject the current page and continue printing at the top of the next one. In a terminal it often displays as an empty line, and it can sometimes be used to clear the screen.

That's why these characters are considered printable, although their representation and behavior can vary considerably.
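Because Python treats both characters as whitespace, they also act as separators in the usual string methods. A small sketch (the sample text is made up for illustration):

```python
# Sample string with a vertical tab and a form feed between words.
text = "some\x0btext\x0chere"

# str.split() with no arguments splits on any whitespace, including \v and \f.
words = text.split()

# str.splitlines() also treats \v and \f as line boundaries.
lines = text.splitlines()

print(words)
print(lines)
print("\x0b".isspace(), "\x0c".isspace())
```

Both calls return the three words as separate items, so stray \v or \f characters in input usually disappear when the text is split.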

Oleh Rybalchenko answered Oct 13 '22 01:10


\x0b is U+000B, LINE TABULATION, and \x0c is U+000C, FORM FEED (FF).

Great. What does that mean, and why are they considered printable?

Back in the days of teletypes, ASCII provided these characters for advancing a sheet of paper through the printer. The line tabulation character was to a line feed like a horizontal tab was to a space. The printer could have a set of defined vertical tab stops, and a line tabulation character would be interpreted as a request to advance to the next one following the current line.

A form feed would advance to the top of the next page, however the device defined a page. (If it used a continuous paper feed, a "page" would be considered, say, 66 lines, and if you were currently on line 60, a form feed would simply advance 7 lines, to line 1 of the next page.)
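The page arithmetic in that example can be written out as a short sketch. The function name and the 66-line default are only illustrative, not anything a real printer driver exposes:

```python
def lines_to_next_page(current_line, page_length=66):
    """Lines to advance from a 1-based current_line to line 1 of the next page.

    Hypothetical helper: page_length=66 is just the figure used in the
    example above.
    """
    return page_length - current_line + 1


# On line 60 of a 66-line page, a form feed advances 7 lines.
print(lines_to_next_page(60))
```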

On a modern terminal emulator, they tend not to have any particular meaning. Simple tests indicate that on an xterm, they both appear to be treated the same as a line feed followed by a space:

>>> print("x\x0cy")
x
 y
>>> print("x\x0by")
x
 y
>>>

Update: after seeing https://stackoverflow.com/a/58421132/1126841, it appears that it's not just a space following the line feed, but rather a line feed not followed by a carriage return, i.e., the cursor advances one line without returning to the beginning of that line; compare with print("\n"), which advances to the beginning of the next line, regardless of where the cursor currently is. (That makes more sense, and I should have remembered that.)
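To make the cursor behavior concrete, here is a toy model of it. This is only a sketch under the assumption described above (\v and \f move down one row and keep the column, \n moves down and resets the column); the `render` function is hypothetical and is not how xterm or any real terminal is implemented:

```python
def render(text):
    """Toy cursor model: returns what the screen would show as a string."""
    rows = [[]]          # each row is a list of characters
    row = col = 0
    for ch in text:
        if ch in "\n\x0b\x0c":
            row += 1                 # every one of these moves down a row
            if ch == "\n":
                col = 0              # only \n also returns to column 0
            if row == len(rows):
                rows.append([])
        elif ch == "\r":
            col = 0                  # carriage return: back to column 0
        else:
            line = rows[row]
            if col > len(line):
                line.extend(" " * (col - len(line)))  # pad up to the cursor
            if col < len(line):
                line[col] = ch       # overwrite an existing cell
            else:
                line.append(ch)
            col += 1
    return "\n".join("".join(line) for line in rows)


print(render("x\x0cy"))
print(render("some\x0btext\x0bhere"))
```

Under this model "x\x0cy" comes out as "x" followed by " y" on the next row, which matches the xterm output shown above.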

chepner answered Oct 13 '22 02:10


These are types of whitespace characters:

>>> string.whitespace
' \t\n\r\x0b\x0c'

From the docs:

A string containing all ASCII characters that are considered whitespace. This includes the characters space, tab, linefeed, return, formfeed, and vertical tab.

We know this by following the docs for printable:

String of ASCII characters which are considered printable. This is a combination of digits, ascii_letters, punctuation, and whitespace.
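That description can be checked directly. The sketch below also shows that string.printable and str.isprintable() use different definitions of "printable", which is part of why these characters look surprising:

```python
import string

# Rebuild printable from the four pieces the docs mention.
combined = (string.digits + string.ascii_letters
            + string.punctuation + string.whitespace)
print(combined == string.printable)

# str.isprintable() is stricter: it rejects every whitespace
# character here except the plain space.
rejected = [c for c in string.printable if not c.isprintable()]
print(rejected)
```

The second list holds \t, \n, \r, \x0b and \x0c, so membership in string.printable does not imply that str.isprintable() returns True.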

And we know what these whitespace characters are by referencing the Unicode control characters:
\x0b is a vertical tab
\x0c is a form feed
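A quick way to inspect them from Python is sketched below. `caret_notation` is a hypothetical helper written for this example; `unicodedata.category` is the standard-library call, and "Cc" is the Unicode category for control characters:

```python
import string
import unicodedata


def caret_notation(ch):
    """Return the Ctrl-key form (e.g. '^I') of a C0 control character."""
    code = ord(ch)
    if code > 0x1F:
        raise ValueError("not a C0 control character")
    return "^" + chr(code ^ 0x40)   # 0x09 -> 'I', 0x0b -> 'K', 0x0c -> 'L'


for ch in string.whitespace:
    category = unicodedata.category(ch)
    # The plain space is a separator (Zs), not a control character.
    caret = caret_notation(ch) if category == "Cc" else "-"
    print(f"{ch!r:8} U+{ord(ch):04X} {category} {caret}")
```

In caret notation \x0b is ^K and \x0c is ^L, alongside the more familiar ^I, ^J and ^M.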

MyNameIsCaleb answered Oct 13 '22 02:10