Calling .shape is giving me the following error:
AttributeError: 'DataFrame' object has no attribute 'shape'
How should I get the shape instead?
To get the shape of a pandas DataFrame, use DataFrame.shape. The shape property returns a tuple representing the dimensionality of the DataFrame, in the format (rows, columns).
Start a Dask Client for the dashboard. The client provides a dashboard that is useful for gaining insight into the computation. The link to the dashboard becomes visible when you create the client. We recommend keeping it open on one side of your screen while you use your notebook on the other.
The shape of a DataFrame is a tuple of array dimensions that gives the number of rows and columns of that DataFrame. The DataFrame.shape attribute in pandas lets us obtain it.
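As a small illustration of the (rows, columns) tuple on the pandas side, here is a minimal sketch. The column names and values are made up for the example and are not from the question.

```python
import pandas as pd

# Hypothetical example data: 3 rows, 2 columns.
df = pd.DataFrame({"name": ["a", "b", "c"], "value": [1, 2, 3]})

# .shape is a plain tuple, so it can be unpacked directly.
n_rows, n_cols = df.shape
print(df.shape)  # (3, 2)
```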
The original pandas query took 182 seconds and the optimized Dask query took 19 seconds, which is about 10 times faster. Dask can provide performance boosts over pandas because it can execute common operations in parallel, where pandas is limited to a single core.
You can get the number of columns directly:
len(df.columns) # this is fast
You can also call len on the dataframe itself, though beware that this will trigger a computation.
len(df) # this requires a full scan of the data
Dask.dataframe doesn't know how many records are in your data without first reading through all of it.