How do I create a new dataframe based on row values of multiple columns in python?

Question

I have multiple columns that contain only 0s and 1s.

Apple	Orange	Pear
1	0	1
0	0	1
1	1	0

I would like to count the number of 1s (as "Correct") and 0s (as "Wrong") in each column, and collect the totals in a new dataframe that looks like the following.

Fruit	Correct	Wrong
Apple	2	1
Orange	1	2
Pear	2	1

I tried a combination of value_counts(), groupby(), and pandas.pivot_table, but got stuck manipulating the table.

mozway · Accepted Answer

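Don't try to compute both Correct and Wrong simultaneously. Once you know one, the other is determined, since the two counts for each column add up to the number of rows.

Simply count the correct answers (the sum of each column, since the values are only 0 or 1) and derive Wrong afterwards: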

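# df holds only 0s and 1s, so each column's sum is its count of 1s
out = (df.sum()
         .rename_axis('Fruit')             # label the index (the former column names)
         .reset_index(name='Correct')      # Series -> DataFrame with Fruit and Correct
         .assign(Wrong=lambda d: len(df) - d['Correct'])  # remaining rows are 0s
      )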

Output:

    Fruit  Correct  Wrong
0   Apple        2      1
1  Orange        1      2
2    Pear        2      1
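
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the dataframe is named df and is built by hand from the sample table in the question; the cross-check against a direct count of 0s (wrong_direct) is an illustrative addition, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question: one column per fruit, values are 0 or 1.
df = pd.DataFrame({
    "Apple": [1, 0, 1],
    "Orange": [0, 0, 1],
    "Pear": [1, 1, 0],
})

out = (
    df.sum()                          # count of 1s per column
      .rename_axis("Fruit")           # label the index holding the column names
      .reset_index(name="Correct")    # Series -> DataFrame
      .assign(Wrong=lambda d: len(df) - d["Correct"])  # rows minus 1s = 0s
)

# Cross-check (illustrative): counting the 0s directly gives the same column.
wrong_direct = df.eq(0).sum()
assert out["Wrong"].tolist() == wrong_direct.tolist()

print(out)
```

Note that the subtraction only works because every cell is 0 or 1. If the columns could contain missing values or other codes, counting each value explicitly (as in the cross-check) would be the safer route.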

Tags: python, pandas, numpy, etl

Beavis
