
Python's function readlines(n) behavior

I've read the documentation, but what does readlines(n) do? By readlines(n), I mean readlines(3) or any other number.

When I run readlines(3), it returns the same thing as readlines().

user2013613 asked Jan 26 '13 19:01


3 Answers

The optional argument is the approximate number of bytes to read from the file. Reading then continues until the current line ends:

readlines([size]) -> list of strings, each a line from the file.

Call readline() repeatedly and return a list of the lines so read.
The optional size argument, if given, is an approximate bound on the
total number of bytes in the lines returned.

Another quote:

If given an optional parameter sizehint, it reads that many bytes from the file and enough more to complete a line, and returns the lines from that.

You're right that it doesn't seem to do much for small files, which is interesting:

In [1]: open('hello').readlines()
Out[1]: ['Hello\n', 'there\n', '!\n']

In [2]: open('hello').readlines(2)
Out[2]: ['Hello\n', 'there\n', '!\n']

One might think it's explained by the following phrase in the documentation:

Read until EOF using readline() and return a list containing the lines thus read. If the optional sizehint argument is present, instead of reading up to EOF, whole lines totalling approximately sizehint bytes (possibly after rounding up to an internal buffer size) are read. Objects implementing a file-like interface may choose to ignore sizehint if it cannot be implemented, or cannot be implemented efficiently.

However, even when I open the file without buffering (a buffering argument of 0, which Python 2 accepts for text files), nothing changes, so the documentation must mean some other kind of internal buffer:

In [4]: open('hello', 'r', 0).readlines(2)
Out[4]: ['Hello\n', 'there\n', '!\n']

On my system, this internal buffer size seems to be around 5k bytes / 1.7k lines:

In [1]: len(open('hello', 'r', 0).readlines(5))
Out[1]: 1756

In [2]: len(open('hello', 'r', 0).readlines())
Out[2]: 28080
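
The session above uses the Python 2 file object. As a rough cross-check, assuming Python 3 (where open() returns an io object and unbuffered text mode is not allowed), the sketch below writes a throwaway file with the same three-line pattern and compares a small hint with no hint. The temporary path, the repeat count of 1000, and the helper name count_lines are made up for illustration.

```python
import os
import tempfile

def count_lines(path, hint=None):
    """Return how many lines readlines() yields for the given hint."""
    with open(path, "r") as f:
        lines = f.readlines() if hint is None else f.readlines(hint)
    return len(lines)

# Hypothetical test data: the three-line pattern from above, repeated.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "hello")
with open(path, "w") as f:
    f.write("Hello\nthere\n!\n" * 1000)

small = count_lines(path, 5)    # hint of 5 characters
everything = count_lines(path)  # no hint: read to EOF
print(small, everything)

# Clean up the throwaway file and directory.
os.remove(path)
os.rmdir(tmp_dir)
```

If the first number comes out far smaller than the second, the hint is being applied without any large rounding, which is what the io module's description of hint (a limit on the total size of the lines read) suggests.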

Lev Levitsky answered Oct 11 '22 04:10


It returns the lines spanned by the first n characters, counting from the current position in the file.

For example, given a text file with the following content:

one
two
three
four

open('text').readlines(0) returns ['one\n', 'two\n', 'three\n', 'four\n']

open('text').readlines(1) returns ['one\n']

open('text').readlines(3) returns ['one\n']

open('text').readlines(4) returns ['one\n', 'two\n']

open('text').readlines(7) returns ['one\n', 'two\n']

open('text').readlines(8) returns ['one\n', 'two\n', 'three\n']

open('text').readlines(100) returns ['one\n', 'two\n', 'three\n', 'four\n']
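
The pattern above can be reproduced without a file on disk. This is a minimal sketch, assuming Python 3 and using io.StringIO as a stand-in for the text file; the helper name lines_for_hint is hypothetical. A fresh stream is created for each call so every read starts at the beginning. Results at the exact boundaries (a hint equal to a running total of line lengths) can differ between file-like implementations, so the hints used here avoid those values.

```python
import io

CONTENT = "one\ntwo\nthree\nfour\n"  # same content as the example file

def lines_for_hint(hint):
    """Read from a fresh in-memory stream so each call starts at offset 0."""
    return io.StringIO(CONTENT).readlines(hint)

# Line lengths are 4, 4, 6 and 5 characters, newline included.
for hint in (0, 1, 5, 9, 100):
    print(hint, lines_for_hint(hint))
```

A hint of 0 is treated as "no limit", so it behaves like readlines() with no argument.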

Sundeep471 answered Oct 11 '22 06:10


If the file is large enough, readlines(hint) returns only a subset of its lines. From the documentation:

f.readlines() returns a list containing all the lines of data in the file. 
If given an optional parameter sizehint, it reads that many bytes from the file 
and enough more to complete a line, and returns the lines from that. 
This is often used to allow efficient reading of a large file by lines, 
but without having to load the entire file in memory. Only complete lines 
will be returned.

So, if your file has thousands of lines, you can pass in, say, 65536, and each call will read roughly that many bytes plus enough to complete the current line, returning only complete lines.
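
Calling readlines(hint) in a loop is one way to use this for batch processing. The following is a sketch under stated assumptions, not a definitive recipe: it assumes Python 3, the generator name iter_line_batches is hypothetical, and an in-memory io.StringIO with 10,000 short lines stands in for a large file.

```python
import io

def iter_line_batches(f, hint=65536):
    """Yield lists of complete lines, each batch roughly `hint` in size."""
    while True:
        batch = f.readlines(hint)
        if not batch:  # an empty list means EOF was reached
            break
        yield batch

# Hypothetical stand-in for a large file: 10,000 short lines in memory.
source = io.StringIO("".join(f"line {i}\n" for i in range(10000)))

total = 0
batches = 0
for batch in iter_line_batches(source, hint=4096):
    batches += 1
    total += len(batch)  # replace with real per-line processing

print(batches, total)
```

Each batch holds only whole lines, so no line is split across two calls, and memory use stays close to the hint rather than the size of the whole file.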

billjamesdev answered Oct 11 '22 06:10