sys.stdin.readline() and input(): which one is faster when reading lines of input, and why?

Tags: python, python-3.x

I'm trying to decide which of the two to use when I need to read lines of input from STDIN, so I'd like to know how to choose between them in different situations.

I found a previous post (https://codereview.stackexchange.com/questions/23981/how-to-optimize-this-simple-python-program) saying that:

How can I optimize this code in terms of time and memory used? Note that I'm using different function to read the input, as sys.stdin.readline() is the fastest one when reading strings and input() when reading integers.

Is that statement true?

469

asked Mar 25 '14 00:03

QzThrone

2 Answers

The builtin input and sys.stdin.readline functions don't do exactly the same thing, and which one is faster may depend on the details of exactly what you're doing. As aruisdante commented, the difference is smaller in Python 3 than it was in Python 2, which is the version the quote you provide dates from, but there are still some differences.

The first difference is that input has an optional prompt parameter, which is written to standard output before the line is read. Handling the prompt leads to some overhead, even if no prompt is given (the default). On the other hand, it may be faster than doing a print before each readline call, if you do want a prompt.

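The next difference is that input strips off the newline from the end of the input. If you're going to strip that anyway, it may be faster to let input do it for you, rather than doing sys.stdin.readline().strip(). Note that strip() also removes any other leading and trailing whitespace, so rstrip("\n") is the closer equivalent of what input does.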
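
To make the stripping difference concrete, here is a small sketch. It reads from an in-memory io.StringIO stream instead of the real STDIN so that it runs without piped input; that substitution and the helper name strip_like_input are assumptions made for this illustration only.

```python
import io


def strip_like_input(line):
    """Drop a single trailing newline, which is what input() does to a line."""
    return line[:-1] if line.endswith("\n") else line


# An in-memory stream stands in for sys.stdin (an assumption for this demo).
stream = io.StringIO("  padded value  \n")
raw = stream.readline()

print(repr(raw))                    # the newline is still attached
print(repr(strip_like_input(raw)))  # newline removed, spaces kept
print(repr(raw.strip()))            # newline and surrounding spaces removed
```

If the surrounding spaces matter to your program, strip() is therefore not a drop-in replacement for what input returns.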

A final difference is how the end of the input is indicated. input will raise an EOFError when you call it if there is no more input (stdin has been closed on the other end). sys.stdin.readline on the other hand will return an empty string at EOF, which you need to know to check for.
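
The two end-of-input conventions lead to slightly different loops. The following is a minimal sketch, again using an in-memory stream in place of the real STDIN; the function names and the with_fake_stdin helper are hypothetical and exist only for this example.

```python
import io
import sys


def read_lines_with_readline(stream):
    """Collect lines until readline() signals EOF with an empty string."""
    lines = []
    while True:
        line = stream.readline()
        if line == "":  # EOF; a blank input line would be "\n" instead
            break
        lines.append(line.rstrip("\n"))
    return lines


def read_lines_with_input():
    """Collect lines until input() signals EOF by raising EOFError."""
    lines = []
    while True:
        try:
            lines.append(input())
        except EOFError:
            break
    return lines


def with_fake_stdin(text, func):
    """Run func() while sys.stdin is temporarily an in-memory stream."""
    original = sys.stdin
    sys.stdin = io.StringIO(text)
    try:
        return func()
    finally:
        sys.stdin = original


sample = "first\n\nlast\n"
print(read_lines_with_readline(io.StringIO(sample)))
print(with_fake_stdin(sample, read_lines_with_input))
```

Both loops collect the same three lines, including the blank one in the middle; only the way they detect the end of the input differs.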

There's also a third option, using the file iteration protocol on sys.stdin. This is likely to perform much like calling readline, but it may give your code nicer logic, since the loop simply ends at EOF.
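
As a rough sketch of the iteration approach, the generator below loops over a stream directly. The name iter_stripped_lines is made up for this example, and the demo feeds it an in-memory stream rather than the real sys.stdin so that it can run on its own.

```python
import io
import sys


def iter_stripped_lines(stream=None):
    """Yield each line of stream (sys.stdin by default) without its newline."""
    if stream is None:
        stream = sys.stdin
    for line in stream:  # the loop ends by itself at EOF
        yield line.rstrip("\n")


# Demo input: three integers, one per line (an assumption for this sketch).
demo = io.StringIO("3\n1\n2\n")
print(sum(int(line) for line in iter_stripped_lines(demo)))
```

In a real script you would call iter_stripped_lines() with no argument to consume STDIN; no explicit EOF check is needed.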

I suspect that while differences in performance between your various options may exist, they're likely to be smaller than the time cost of simply reading the file from the disk (if it is large) and doing whatever you are doing with it. I suggest that you avoid the trap of premature optimization and just do what is most natural for your problem, and if the program is too slow (where "too slow" is very subjective), you do some profiling to see what is taking the most time. Don't put a whole lot of effort into deciding between the different ways of taking input unless it actually matters.
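
One possible way to do that profiling is the standard cProfile module. The sketch below profiles a made-up workload (sum_numbers, which reads one integer per line) on generated in-memory data; both the workload and the data are assumptions, so substitute your own function and input.

```python
import cProfile
import io
import pstats


def sum_numbers(stream):
    """Hypothetical workload: read one integer per line and add them up."""
    total = 0
    for line in stream:
        total += int(line)
    return total


def profile_sum(n_lines=10000):
    """Profile sum_numbers on generated data; return its result and a report."""
    data = io.StringIO("".join(f"{i}\n" for i in range(n_lines)))
    profiler = cProfile.Profile()
    total = profiler.runcall(sum_numbers, data)
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats(5)  # five most expensive entries
    return total, report.getvalue()


total, report = profile_sum()
print(total)
print(report)
```

If the input-reading call does not show up near the top of such a report, switching between input and readline is unlikely to make a noticeable difference.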

113

answered Sep 26 '22 02:09

Blckknght

As Linn1024 says, for reading large amounts of data input() is much slower. A simple example is this:

import sys
for i in range(int(sys.argv[1])):
    sys.stdin.readline()

This takes about 0.25μs per iteration:

$ time yes | py readline.py 1000000
yes  0.05s user 0.00s system 22% cpu 0.252 total

Changing that to sys.stdin.readline().strip() raises that to about 0.31μs per iteration.

Changing readline() to input() is about 10 times slower:

$ time yes | py input.py 1000000
yes  0.05s user 0.00s system 1% cpu 2.855 total

Notice that it's still pretty fast though (under 3μs per line), so you only really need to worry when you are reading on the order of a million lines, as above.
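
If you would rather measure this without a shell pipeline, a self-contained variant can be sketched with timeit. Note the assumptions: sys.stdin is temporarily replaced by an in-memory io.StringIO, so the absolute numbers will not match the piped measurements above, and the function names are invented for this example.

```python
import io
import sys
import timeit


def seconds_per_line(read_one, n_lines=100_000):
    """Average time of one read_one() call against an in-memory stdin."""
    original = sys.stdin
    sys.stdin = io.StringIO("y\n" * n_lines)
    try:
        total = timeit.timeit(read_one, number=n_lines)
    finally:
        sys.stdin = original  # always restore the real stdin
    return total / n_lines


def read_with_readline():
    return sys.stdin.readline()


def read_with_input():
    return input()


for name, reader in [("readline", read_with_readline), ("input", read_with_input)]:
    print(f"{name}: {seconds_per_line(reader) * 1e6:.3f} µs per line")
```

Treat the output as a relative comparison on your own machine and Python version rather than as a reproduction of the timings quoted in this answer.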

answered Sep 25 '22 02:09

Thomas Ahle
