import dask.dataframe as dd
df = dd.read_csv('csv', usecols=fields, skip_blank_lines=True)
len(df.iloc[0:5])
The above code raises
AttributeError: 'DataFrame' object has no attribute 'iloc'
I also tried ix and loc, but I am unable to select rows by their position.
Dask DataFrame does not support selecting rows by position with iloc. In general, it is hard to access a particular row in a CSV file without first reading everything that comes before it, because rows vary in length and nothing records where row N starts.
However, if you only want a few rows from the top, I recommend using the .head() method:
>>> df.head()
One workaround is to store the row number as a column in your CSV file (called df_index here) and filter on it like so:
selection = df[df['df_index'].isin(list_of_indexes)].compute()