I want to get the number of rows read so far, each time a new row is added, while loading a .csv file into a dataframe:
from pandas import read_csv

def file_len(fname):
    # Count the lines in the file (this includes the header line)
    with open(fname) as f:
        for i, l in enumerate(f):
            pass
    return i + 1

csv_path = "C:/...."
max_length = file_len(csv_path)
data = read_csv(csv_path, sep=';', encoding='utf-8')
With that code I get the maximum number of rows, but I don't know how to get the number of rows in the dataframe each time one is added. I want to use these counts to drive a 0-100% progress bar.
You can't do this directly: you would have to modify the read_csv function, and maybe other functions in pandas.
EDIT:
It seems it can be done now with chunksize=rows_number. Using only iterator=True didn't work for me, or maybe it needed more rows. Thanks to Jeff.
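Try this:
import pandas as pd
from io import StringIO  # Python 3; on Python 2 this was "from StringIO import StringIO"

data = """A,B,C
foo,1,2,3
bar,4,5,6
baz,7,8,9
"""

# chunksize=1 makes read_csv return an iterator that yields one-row DataFrames
reader = pd.read_csv(StringIO(data), chunksize=1)
for x in reader:
    print(x)
    print('--- next data ---')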
Result:
     A  B  C
foo  1  2  3
--- next data ---
     A  B  C
bar  4  5  6
--- next data ---
     A  B  C
baz  7  8  9
--- next data ---
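To connect this back to the progress bar in the question, here is a minimal sketch of one way to combine chunked reading with a known row total. The helper name read_csv_with_progress, its parameters, and the sample data are all hypothetical and only for illustration. It assumes the total is the number of data rows, so with the question's file_len you would pass file_len(csv_path) - 1 to exclude the header, and that no field contains an embedded newline.

```python
import io
import pandas as pd

def read_csv_with_progress(source, total_rows, chunksize=1000, **kwargs):
    """Hypothetical helper: read a CSV in chunks and print percent complete."""
    chunks = []
    rows_read = 0
    # With chunksize set, read_csv yields DataFrames of up to `chunksize` rows
    for chunk in pd.read_csv(source, chunksize=chunksize, **kwargs):
        chunks.append(chunk)
        rows_read += len(chunk)
        percent = 100 * rows_read // total_rows if total_rows else 100
        print(f"{rows_read}/{total_rows} rows ({percent}%)")
    # Stitch the chunks back into a single DataFrame
    return pd.concat(chunks) if chunks else pd.DataFrame()

# Small in-memory sample standing in for the real file (illustrative data only)
sample = "A;B\n1;2\n3;4\n5;6\n7;8\n"
total = sample.count("\n") - 1  # data rows, excluding the header line
df = read_csv_with_progress(io.StringIO(sample), total, chunksize=2, sep=";")
```

A larger chunksize means fewer progress updates but less per-chunk overhead, so the value is a trade-off you would tune for your file size. The print call could be swapped for whatever progress bar widget you use.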