I am trying to count the duplicates of each type of row in my dataframe. For example, say that I have a dataframe in pandas as follows:
df = pd.DataFrame({'one': pd.Series([1., 1, 1]), 'two': pd.Series([1., 2., 1])})
I get a df that looks like this:
   one  two
0    1    1
1    1    2
2    1    1
I imagine the first step is to find all the different unique rows, which I do by:
df.drop_duplicates()
This gives me the following df:
   one  two
0    1    1
1    1    2
Now I want to take each row from the above df ([1 1] and [1 2]) and get a count of how many times each is in the initial df. My result would look something like this:
Row     Count
[1 1]   2
[1 2]   1
How should I go about doing this last step?
Edit:
Here's a larger example to make it more clear:
df = pd.DataFrame({'one': pd.Series([True, True, True, False]), 'two': pd.Series([True, False, False, True]), 'three': pd.Series([True, False, False, False])})
gives me:
     one  three    two
0   True   True   True
1   True  False  False
2   True  False  False
3  False  False   True
I want a result that tells me:
Row                   Count
[True True True]      1
[True False False]    2
[False False True]    1
You can count the number of duplicate rows by counting the True values in the pandas.Series returned by duplicated(). Since True counts as 1, the sum() method gives that number directly. To count the False values instead (the number of non-duplicate rows), invert the Series with the negation operator ~ and then call sum().
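As a minimal sketch of the counting described above, using the first example DataFrame from the question (the variable names dup_mask, n_duplicates and n_non_duplicates are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'one': [1., 1, 1], 'two': [1., 2., 1]})

# True for every row that repeats an earlier row (the first occurrence stays False)
dup_mask = df.duplicated()

# True counts as 1, so sum() gives the number of duplicate rows
n_duplicates = dup_mask.sum()

# Invert the mask with ~ to count the rows that are not duplicates
n_non_duplicates = (~dup_mask).sum()

print(n_duplicates, n_non_duplicates)
```

Note that this gives one total for the whole DataFrame, not a count per distinct row, so it only partly answers the question.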
You can use groupby on all the columns with the size function, then reset the index and rename column 0 to count.
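A sketch of that approach on the larger example from the question might look like this. It passes name='count' to reset_index rather than renaming column 0 afterwards, which should give the same result in one step; the variable name counts is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'one': [True, True, True, False],
    'two': [True, False, False, True],
    'three': [True, False, False, False],
})

# Group on every column so that each distinct row forms one group,
# take the size of each group, and turn the result back into a
# DataFrame whose size column is called 'count'
counts = (
    df.groupby(df.columns.tolist())
      .size()
      .reset_index(name='count')
)

print(counts)
```

Each row of counts is one distinct row of df followed by the number of times it occurs, and the counts add up to len(df).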
Finding duplicate rows

To look at duplication across the DataFrame as a whole, call the duplicated() method on the DataFrame. It returns True for each row that is identical to an earlier row.
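To illustrate what duplicated() returns, here is a small sketch on the first example DataFrame. The keep=False variant is an addition not mentioned above: it flags every copy of a repeated row, including the first one.

```python
import pandas as pd

df = pd.DataFrame({'one': [1., 1, 1], 'two': [1., 2., 1]})

# Default (keep='first'): only the later copies of a row are flagged
first_seen = df.duplicated()

# keep=False: every row that has at least one identical twin is flagged
all_copies = df.duplicated(keep=False)

# Boolean indexing then selects the repeated rows themselves
repeated_rows = df[all_copies]

print(first_seen.tolist())
print(all_copies.tolist())
print(repeated_rows)
```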
You can groupby on all the columns and call size. The index of the result holds each distinct row, and the value is the number of times that row occurs:
In [28]: df.groupby(df.columns.tolist(),as_index=False).size()
Out[28]:
one    three  two
False  False  True     1
True   False  False    2
       True   True     1
dtype: int64
df.groupby(df.columns.tolist()).size().reset_index().\
    rename(columns={0:'records'})

   one  two  records
0    1    1        2
1    1    2        1