
How should I get the shape of a dask dataframe?

Calling .shape on a Dask DataFrame gives me the following error:

AttributeError: 'DataFrame' object has no attribute 'shape'

How should I get the shape instead?

Asked by user1559897 on May 15 '18 at 16:05


1 Answer

You can get the number of columns directly:

len(df.columns)  # this is fast 

You can also call len on the dataframe itself, but beware that this triggers a computation:

len(df)  # this requires a full scan of the data 

Dask.dataframe doesn't know how many records are in your data without first reading through all of it.

Answered by MRocklin on Oct 14 '22 at 16:10