I want to get the number of rows loaded so far, each time a new row is added, while I load a .csv file into a dataframe:
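from pandas import read_csv

def file_len(fname):
    # Count the lines in the file (this includes the header line).
    count = 0  # stays 0 for an empty file
    with open(fname) as f:
        for count, _ in enumerate(f, start=1):
            pass
    return count

csv_path = "C:/...."
max_length = file_len(csv_path)
data = read_csv(csv_path, sep=';', encoding='utf-8')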
With that code I get the maximum number of rows, but I don't know how to get the number of rows in the dataframe each time a new one is added. I want to use these counts to drive a 0-100% progress bar.
You can't do this directly: you would have to modify the read_csv function, and maybe other functions in pandas.
EDIT:
It seems it can be done now with chunksize=rows_number.
Using only iterator=True didn't work for me - or maybe it needed more rows.
Thanks to Jeff
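A possible explanation for the iterator=True observation: a reader created with iterator=True and no chunksize has no default step, so looping over it reads everything that is left in one go. Asking for rows explicitly with get_chunk(n) still reads in steps. A minimal sketch, where read_in_steps is a hypothetical helper name and not part of pandas:

```python
from io import StringIO

import pandas as pd


def read_in_steps(source, step):
    """Return the size of each chunk read via iterator=True and get_chunk()."""
    reader = pd.read_csv(source, iterator=True)
    sizes = []
    while True:
        try:
            # Ask for at most `step` rows; the last chunk may be shorter.
            chunk = reader.get_chunk(step)
        except StopIteration:
            # Raised once the reader has no rows left.
            break
        if chunk.empty:
            break
        sizes.append(len(chunk))
    return sizes
```

For a file with three data rows and step=2, this would read one chunk of two rows and then one chunk of one row.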
Try this
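import pandas as pd
from io import StringIO  # Python 3; on Python 2 this was "from StringIO import StringIO"

# Each data row has one more field than the header, so pandas uses the first column as the index.
data = """A,B,C
foo,1,2,3
bar,4,5,6
baz,7,8,9
"""
# chunksize=1 makes read_csv return an iterator that yields one-row dataframes.
reader = pd.read_csv(StringIO(data), chunksize=1)
for x in reader:
    print(x)
    print('--- next data ---')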
result:
     A  B  C
foo  1  2  3
--- next data ---
     A  B  C
bar  4  5  6
--- next data ---
     A  B  C
baz  7  8  9
--- next data ---
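To get from these chunks to the 0-100% progress bar asked about in the question, one option is to keep a running row count and report it as a percentage after each chunk. This is only a minimal sketch: it assumes the total number of data rows is known beforehand (for example file_len(csv_path) minus one for the header line), and read_csv_with_progress and on_progress are hypothetical names, not pandas API.

```python
from io import StringIO

import pandas as pd


def read_csv_with_progress(source, total_rows, chunksize=1000,
                           on_progress=print, **read_csv_kwargs):
    """Read a CSV in chunks and report the percentage of rows loaded so far."""
    chunks = []
    rows_read = 0
    total = max(total_rows, 1)  # avoid dividing by zero
    for chunk in pd.read_csv(source, chunksize=chunksize, **read_csv_kwargs):
        chunks.append(chunk)
        rows_read += len(chunk)
        # Integer percentage, capped at 100 in case total_rows was underestimated.
        on_progress(min(100, 100 * rows_read // total))
    # Stitch the chunks back into a single dataframe.
    return pd.concat(chunks)
```

With the sample data above it could be called as read_csv_with_progress(StringIO(data), total_rows=3, chunksize=1). Passing a different on_progress callback lets the same loop update a GUI widget instead of printing. A larger chunksize means fewer updates but less overhead per row.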