With Python's readlines() function I can retrieve a list of each line in a file:
with open('dat.csv', 'r') as dat:
    lines = dat.readlines()
I am working on a problem involving a very large file, and this method is producing a memory error. Is there a pandas equivalent to Python's readlines() function? The chunksize option of pd.read_csv() seems to append numbers to my lines, which is far from ideal.
Minimal example:
In [1]: lines = []
In [2]: for df in pd.read_csv('s.csv', chunksize = 100):
   ...:     lines.append(df)
In [3]: lines
Out[3]:
[   hello here is a line
 0  here is another line
 1  here is my last line]
In [4]: with open('s.csv', 'r') as dat:
   ...:     lines = dat.readlines()
   ...:
In [5]: lines
Out[5]: ['hello here is a line\n', 'here is another line\n', 'here is my last line\n']
In [6]: cat s.csv
hello here is a line
here is another line
here is my last line
You should try the chunksize option of pd.read_csv(), as mentioned in some of the comments.
This forces pd.read_csv() to read a defined number of lines at a time instead of trying to read the entire file in one go. It would look like this:
>> df = pd.read_csv(filepath, chunksize=1, header=None, encoding='utf-8')
In the above example the file is read one line at a time, and header=None keeps the first line as data instead of treating it as column names.
Now, in fact, according to the documentation of pandas.read_csv, it is not a pandas.DataFrame object that is being returned here, but a TextFileReader object instead.
- chunksize : int, default None
Return TextFileReader object for iteration. See IO Tools docs for more information on iterator and chunksize.
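To make the return type concrete, here is a minimal sketch, assuming pandas is installed. The file name sample_lines.txt and its contents are hypothetical and are written by the snippet itself so that it can run on its own. It iterates over the reader and records the type and shape of each piece it produces.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample file, created here only so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "sample_lines.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("This is a new line\nThis is another line of text\nAnd this is the last line\n")

# With chunksize set, read_csv returns an iterator rather than a DataFrame.
reader = pd.read_csv(path, header=None, chunksize=2, encoding="utf-8")

chunk_types = []
chunk_shapes = []
for chunk in reader:
    # Each item produced by the iterator is an ordinary DataFrame
    # holding at most `chunksize` rows.
    chunk_types.append(type(chunk).__name__)
    chunk_shapes.append(chunk.shape)

print(chunk_types)
print(chunk_shapes)
```

Only one chunk is held in memory at a time, which is what makes this approach usable on files too large for a single read.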
Therefore, in order to complete the exercise, you would need to put this in a loop like this:
In [385]: cat data_sample.tsv
This is a new line
This is another line of text
And this is the last line of text in this file
In [386]: lines = []
In [387]: for line in pd.read_csv('./data_sample.tsv', encoding='utf-8', header=None, chunksize=1):
   .....:     lines.append(line.iloc[0,0])
   .....:
In [388]: print(lines)
['This is a new line', 'This is another line of text', 'And this is the last line of text in this file']
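As an alternative that this answer does not cover, plain Python may also be enough to avoid the memory error: a file object is itself a lazy iterator over its lines, so it can be consumed in batches without calling readlines(). The sketch below is only an illustration. The helper name iter_line_batches and the batch size are assumptions, and unlike pd.read_csv() it leaves any commas inside a line untouched.

```python
import os
import tempfile
from itertools import islice


def iter_line_batches(path, batch_size=1000, encoding="utf-8"):
    """Yield lists of up to `batch_size` lines, with trailing newlines stripped."""
    with open(path, "r", encoding=encoding) as f:
        while True:
            # islice pulls at most batch_size lines from the file iterator,
            # so only one batch is held in memory at a time.
            batch = [line.rstrip("\n") for line in islice(f, batch_size)]
            if not batch:
                break
            yield batch


# Hypothetical sample file, created here only so the sketch can run as-is.
sample_path = os.path.join(tempfile.mkdtemp(), "s.csv")
with open(sample_path, "w", encoding="utf-8") as f:
    f.write("hello here is a line\nhere is another line\nhere is my last line\n")

# Collecting every batch into a list is for demonstration only; on a very
# large file each batch would be processed and then discarded.
batches = list(iter_line_batches(sample_path, batch_size=2))
print(batches)
```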
I hope this helps!