
In Python, what is the difference between f.readlines() and list(f)

In both the Python 2 Tutorial and the Python 3 Tutorial, there is a line midway through section 7.2.1 that says:

If you want to read all the lines of a file in a list you can also use list(f) or f.readlines().

So my question is: what is the difference between these two ways of turning a file object into a list? I am curious about both the performance and the underlying Python object implementation (and possibly any difference between Python 2 and Python 3).

YaOzI asked May 30 '14 15:05





1 Answer

Functionally, there is no difference; both methods result in the exact same list.
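As a quick check of that claim, here is a minimal sketch that reads the same file both ways and compares the results. The temporary file and its three lines are made up for illustration; nothing here comes from the tutorial itself.

```python
import os
import tempfile

# Write a small sample file (hypothetical content, for illustration only).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    path = tmp.name

# Read the file by consuming the file object as an iterator.
with open(path) as f:
    via_list = list(f)

# Read the file with the dedicated method.
with open(path) as f:
    via_readlines = f.readlines()

os.remove(path)

print(via_list == via_readlines)  # True
print(via_list)                   # ['first\n', 'second\n', 'third\n']
```

Both lists keep the trailing newline on each line, since neither approach strips it.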

Implementation-wise, list(f) uses the file object as an iterator (calling next(f) repeatedly until StopIteration is raised), while f.readlines() uses a dedicated method to read the whole file.
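To make the iterator path concrete, here is a rough sketch of what list(f) amounts to, written out by hand. The helper name lines_via_next is hypothetical, and io.StringIO stands in for a real file purely to keep the example self-contained.

```python
import io

def lines_via_next(f):
    """Collect lines by calling next(f) until StopIteration, roughly what list(f) does."""
    lines = []
    while True:
        try:
            lines.append(next(f))
        except StopIteration:
            break
    return lines

sample = "alpha\nbeta\ngamma"  # hypothetical content; the last line has no newline

manual = lines_via_next(io.StringIO(sample))
dedicated = io.StringIO(sample).readlines()

print(manual)               # ['alpha\n', 'beta\n', 'gamma']
print(manual == dedicated)  # True
```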

Python 2 and 3 differ in what that means, exactly, unless you use io.open() in Python 2. Python 2 file objects use a hidden buffer for file iteration, which can trip you up if you mix file object iteration and .readline() or .readlines() calls.

The io library (which handles all file I/O in Python 3) does not use such a hidden buffer; all buffering is instead handled by the buffered wrapper classes derived from io.BufferedIOBase (such as io.BufferedReader). In fact, the io.IOBase.readlines() implementation uses the file object as an iterator under the hood anyway, and TextIOWrapper iteration delegates to TextIOWrapper.readline(), so list(f) and f.readlines() are essentially the same thing.
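A small Python 3 sketch of the point about the io library: a text-mode file is a TextIOWrapper sitting on top of a buffered object, and because there is no separate iteration buffer, alternating next(f) and f.readline() returns the lines in order. The file contents are invented for the example.

```python
import io
import os
import tempfile

# Hypothetical four-line sample file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("one\ntwo\nthree\nfour\n")
    path = tmp.name

with open(path) as f:
    # Text-mode files in Python 3 are TextIOWrapper objects over a buffered layer.
    is_text_wrapper = isinstance(f, io.TextIOWrapper)
    has_buffered_layer = isinstance(f.buffer, io.BufferedIOBase)
    # Alternate iteration and readline(); no lines are skipped or repeated.
    mixed = [next(f), f.readline(), next(f), f.readline()]

os.remove(path)

print(is_text_wrapper, has_buffered_layer)  # True True
print(mixed)  # ['one\n', 'two\n', 'three\n', 'four\n']
```

This only demonstrates the io-based behaviour; Python 2's built-in file objects are the case where mixing the two can trip you up.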

Performance-wise, there isn't really a difference even in Python 2, as the bottleneck is file I/O: how quickly you can read the data from disk. At a micro level, performance can depend on other factors, such as whether the OS has already buffered the data and how long the lines are.
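If you want to measure this on your own machine, a rough timing sketch with timeit could look like the following. The file size (10,000 short lines) and the repeat count are arbitrary assumptions, and the numbers will vary with OS caching and line length, so treat the output as indicative only.

```python
import os
import tempfile
import timeit

# Build a throwaway file of 10,000 short lines (sizes chosen arbitrarily).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.writelines(f"line {i}\n" for i in range(10_000))
    path = tmp.name

def read_with_list():
    with open(path) as f:
        return list(f)

def read_with_readlines():
    with open(path) as f:
        return f.readlines()

# Sanity check that both approaches produce the same list before timing them.
same_result = read_with_list() == read_with_readlines()

t_list = timeit.timeit(read_with_list, number=20)
t_readlines = timeit.timeit(read_with_readlines, number=20)

os.remove(path)

print("same result:", same_result)
print(f"list(f):       {t_list:.4f} s for 20 runs")
print(f"f.readlines(): {t_readlines:.4f} s for 20 runs")
```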

Martijn Pieters answered Sep 29 '22 01:09
