Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How is split('\n') method in Python implemented?

Tags:

python

split

This is a theoretical question to be able to understand a difference between Java and Python. To read the content of a file into an array in Java, you need to know the number of lines, in order to define the size of the array when you declare it. And because you cannot know it in advance, you need to apply some tricks to overcome that problem.

In Python though, lists can be of any size, so reading the content of a file into a list can be either done as:

lines = open('filename').read().split('\n')

or

lines = open('filename').readlines()

How does split('\n') work in this occasion? Is Python implementation performing some kind of tricks underneath as well (like doubling the size of an array when needed, etc.)?

Any information in shedding light into this would be much appreciated.

like image 898
Lejlek Avatar asked Dec 28 '22 22:12

Lejlek


1 Answers

The implementation of str.split() internally calls list.append(), which in turn calls the internal function list_resize(). From a comment in the source code of this function:

This over-allocates proportional to the list size, making room for additional growth. The over-allocation is mild, but is enough to give linear-time amortized behavior over a long sequence of appends() in the presence of a poorly-performing system realloc().

The growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...

like image 200
Sven Marnach Avatar answered Jan 06 '23 16:01

Sven Marnach