
Linux terminal input: reading user input from terminal truncating lines at 4095 character limit

In a bash script, I read lines from standard input using the built-in read command after setting IFS=$'\n'. If I paste the input into the terminal, the lines are truncated at 4095 characters. The limit seems to come from reading from a terminal, because reading a 10000-character line from a pipe works perfectly fine:

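# Build a 94-character filler string.
fill=
for i in $(seq 1 94); do fill="${fill}x"; done
# Emit 100 chunks of 100 characters each on a single line and read it from a pipe.
for i in $(seq 1 100); do printf '%04d00%s' "$i" "$fill"; done | (read -r line; echo "$line")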

I see the same behavior with a Python script (it does not accept more than 4095 characters from the terminal, but accepts longer input from a pipe):

#!/usr/bin/python

from sys import stdin

line = stdin.readline()
print('%s' % line)

Even a C program using read(2) behaves the same way:

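#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[32768];
    /* read(2) returns -1 on error, so check before using it as an index. */
    ssize_t sz = read(0, buf, sizeof(buf) - 1);
    if (sz < 0) {
        perror("read");
        return 1;
    }
    buf[sz] = '\0';
    printf("READ LINE: [%s]\n", buf);
    return 0;
}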

In all cases, I cannot enter more than about 4095 characters. The input prompt stops accepting characters.

Question-1: Is there a way to interactively read lines longer than 4095 characters from a terminal on Linux systems (at least Ubuntu 10.04 and 13.04)?

Question-2: Where does this limitation come from?

Systems affected: I noticed this limitation in Ubuntu 10.04/x86 and 13.04/x86, but Cygwin (a recent version, at least) still does not truncate at over 10000 characters (I did not test further, since I need to get this script working on Ubuntu). Terminals used: Virtual Console and KDE konsole (Ubuntu 13.04) and gnome-terminal (Ubuntu 10.04).

FooF asked Aug 02 '13 10:08



1 Answer

Please refer to the termios(3) manual page, section "Canonical and noncanonical mode".

Typically, the terminal (standard input) is in canonical mode; in this mode the kernel buffers the input line before returning it to the application. On Linux the limit is hard-coded (N_TTY_BUF_SIZE, defined in ${linux_source_path}/include/linux/tty.h) to 4096, which allows 4095 characters of input, not counting the terminating newline. See also the file ${linux_source_path}/drivers/tty/n_tty.c, function n_tty_receive_buf_common(), and the comment above it.
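
To check which mode a terminal is currently in, a minimal Python sketch could inspect the ICANON flag through the standard termios module. The helper name is_canonical is made up for this illustration and is not part of the original answer.

```python
import os
import termios


def is_canonical(fd):
    """Return True if fd is a terminal in canonical (line-buffered) mode."""
    if not os.isatty(fd):
        # Pipes and regular files do not go through the tty line buffer.
        return False
    lflag = termios.tcgetattr(fd)[3]  # index 3 holds the local-mode flags
    return bool(lflag & termios.ICANON)
```

Calling is_canonical(0) from an interactive shell would normally be expected to return True, while the same call with standard input redirected from a pipe returns False.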

In noncanonical mode the kernel by default does no line buffering, and the read(2) system call returns as soon as a single character of input is available (a key is pressed). You can change the terminal settings (the MIN and TIME values, c_cc[VMIN] and c_cc[VTIME]) to read a specified number of characters or to set a timeout in noncanonical mode, but the hard-coded limit of 4095 still applies then, per the termios(3) manual page (and the comment above the aforementioned n_tty_receive_buf_common()).
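
As a sketch of the settings mentioned above, the following Python helper (hypothetical name noncanonical_attrs) takes the list returned by termios.tcgetattr() and returns a copy with ICANON cleared and the minimum-count and timeout entries, c_cc[VMIN] and c_cc[VTIME], filled in. It only builds the attribute list; applying it is left to termios.tcsetattr().

```python
import termios


def noncanonical_attrs(attrs, vmin=1, vtime=0):
    """Return a copy of a tcgetattr() result with canonical mode turned off.

    vmin  is the minimum number of bytes read() waits for (c_cc[VMIN]).
    vtime is the timeout in tenths of a second (c_cc[VTIME]).
    """
    new = list(attrs)
    new[6] = list(attrs[6])        # copy the control-character list as well
    new[3] &= ~termios.ICANON      # index 3 holds the local-mode flags
    new[6][termios.VMIN] = vmin
    new[6][termios.VTIME] = vtime
    return new
```

On a real terminal this might be used as termios.tcsetattr(fd, termios.TCSADRAIN, noncanonical_attrs(termios.tcgetattr(fd))), keeping the original tcgetattr() result around so the settings can be restored afterwards.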

The bash read builtin still works in noncanonical mode, as the following demonstrates:

IFS=$'\n'      # Allow spaces and other white spaces.
stty -icanon   # Disable canonical mode.
read line      # Now we can read without inhibitions set by terminal.
stty icanon    # Re-enable canonical mode (assuming it was enabled to begin with).

With stty -icanon added, you can paste a string longer than 4096 characters and read it successfully with the bash read builtin (I successfully tried more than 10000 characters).
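
For comparison, roughly the same approach can be sketched in Python with the standard termios module. This is an illustration under assumptions, not part of the original answer: the function name read_long_line is hypothetical, and it reads one byte at a time so that nothing past the newline is consumed. Unlike the shell snippet, it saves the previous terminal settings and restores them afterwards instead of assuming canonical mode was enabled.

```python
import os
import termios


def read_long_line(fd=0):
    """Read bytes from fd up to the first newline (or end of input)."""
    saved = None
    if os.isatty(fd):
        saved = termios.tcgetattr(fd)
        attrs = termios.tcgetattr(fd)
        attrs[3] &= ~termios.ICANON   # same effect as `stty -icanon`
        termios.tcsetattr(fd, termios.TCSADRAIN, attrs)
    try:
        data = bytearray()
        while True:
            byte = os.read(fd, 1)     # one byte per read(2) call
            if not byte or byte == b"\n":
                break
            data += byte
        return bytes(data)            # the newline itself is not included
    finally:
        if saved is not None:
            # Restore whatever mode the terminal was in before.
            termios.tcsetattr(fd, termios.TCSADRAIN, saved)
```

When the descriptor is not a terminal (for example a pipe), the function skips the termios calls and simply reads until the newline.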

If you put this in a file, i.e. make it a script, you can use strace to see the system calls it makes: read(2) is called multiple times, each call returning a single character as you type.

FooF answered Sep 16 '22 21:09
