The size property returns the number of elements in the DataFrame. The number of elements is the number of rows * the number of columns.
length method= len() => It's return number of element from value of variable. count method = count() =>It's return how many times appeared from value of variable which you are specified value.
Size and shape of a dataframe in pandas python: Size of a dataframe is the number of fields in the dataframe which is nothing but number of rows * number of columns. Shape of a dataframe gets the number of rows and number of columns of the dataframe.
size
includes NaN
values, count
does not:
In [46]:
df = pd.DataFrame({'a':[0,0,1,2,2,2], 'b':[1,2,3,4,np.NaN,4], 'c':np.random.randn(6)})
df
Out[46]:
a b c
0 0 1 1.067627
1 0 2 0.554691
2 1 3 0.458084
3 2 4 0.426635
4 2 NaN -2.238091
5 2 4 1.256943
In [48]:
print(df.groupby(['a'])['b'].count())
print(df.groupby(['a'])['b'].size())
a
0 2
1 1
2 2
Name: b, dtype: int64
a
0 2
1 1
2 3
dtype: int64
What is the difference between size and count in pandas?
The other answers have pointed out the difference, however, it is not completely accurate to say "size
counts NaNs while count
does not". While size
does indeed count NaNs, this is actually a consequence of the fact that size
returns the size (or the length) of the object it is called on. Naturally, this also includes rows/values which are NaN.
So, to summarize, size
returns the size of the Series/DataFrame1,
df = pd.DataFrame({'A': ['x', 'y', np.nan, 'z']})
df
A
0 x
1 y
2 NaN
3 z
<!- _>
df.A.size
# 4
...while count
counts the non-NaN values:
df.A.count()
# 3
Notice that size
is an attribute (gives the same result as len(df)
or len(df.A)
). count
is a function.
1. DataFrame.size
is also an attribute and returns the number of elements in the DataFrame (rows x columns).
GroupBy
- Output StructureBesides the basic difference, there's also the difference in the structure of the generated output when calling GroupBy.size()
vs GroupBy.count()
.
df = pd.DataFrame({
'A': list('aaabbccc'),
'B': ['x', 'x', np.nan, np.nan,
np.nan, np.nan, 'x', 'x']
})
df
A B
0 a x
1 a x
2 a NaN
3 b NaN
4 b NaN
5 c NaN
6 c x
7 c x
Consider,
df.groupby('A').size()
A
a 3
b 2
c 3
dtype: int64
Versus,
df.groupby('A').count()
B
A
a 2
b 0
c 2
GroupBy.count
returns a DataFrame when you call count
on all column, while GroupBy.size
returns a Series.
The reason being that size
is the same for all columns, so only a single result is returned. Meanwhile, the count
is called for each column, as the results would depend on on how many NaNs each column has.
pivot_table
Another example is how pivot_table
treats this data. Suppose we would like to compute the cross tabulation of
df
A B
0 0 1
1 0 1
2 1 2
3 0 2
4 0 0
pd.crosstab(df.A, df.B) # Result we expect, but with `pivot_table`.
B 0 1 2
A
0 1 2 1
1 0 0 1
With pivot_table
, you can issue size
:
df.pivot_table(index='A', columns='B', aggfunc='size', fill_value=0)
B 0 1 2
A
0 1 2 1
1 0 0 1
But count
does not work; an empty DataFrame is returned:
df.pivot_table(index='A', columns='B', aggfunc='count')
Empty DataFrame
Columns: []
Index: [0, 1]
I believe the reason for this is that 'count'
must be done on the series that is passed to the values
argument, and when nothing is passed, pandas decides to make no assumptions.
Just to add a little bit to @Edchum's answer, even if the data has no NA values, the result of count() is more verbose, using the example before:
grouped = df.groupby('a')
grouped.count()
Out[197]:
b c
a
0 2 2
1 1 1
2 2 3
grouped.size()
Out[198]:
a
0 2
1 1
2 3
dtype: int64
When we are dealing with normal dataframes then only difference will be an inclusion of NAN values, means count does not include NAN values while counting rows.
But if we are using these functions with the groupby
then, to get the correct results by count()
we have to associate any numeric field with the groupby
to get the exact number of groups where for size()
there is no need for this type of association.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With