In my code, df is defined like this:
df = pd.read_excel(io=file_name, sheet_name=sheet, sep='\s*,\s*')
I have a [86 rows x 1 columns] dataframe df, which looks like this on print(df):
0
Male 511
Female 461
Male 273
Female 217
Male 394
Female 337
Female 337
Male 337
...
I wish to write code that would merge the Male and Female entries like this:
          0    1    2    3 ...
Male    511  273  394  337 ...
Female  461  217  337  337 ...
The final task I need to do is to .sum() the Male row and then the Female row to get the total for each sex. I am new to Python and pandas and haven't been able to make much progress so far. Any help, tutorials, or documentation would be great! Thank you!
Edit: By keys I mean the indexes. I hope the Male and Female labels can be used to 'club' these rows together, but I don't know how.
Edit: I have accomplished my last task directly via
print(df.ix['Female'].sum())
print(df.ix['Male'].sum())
But I have yet to achieve my first task. Any ideas?
Use pandas.DataFrame.sum() to sum the rows of a DataFrame. Calling df.sum(axis=1) adds the values across the columns of each row and returns one total per row. To restrict the sum to certain columns, first select them with a list of column names.
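As a small illustration of the axis argument, here is a sketch on a made-up frame. The name `scores` and its columns `a`, `b`, `c` are hypothetical and not taken from the question.

```python
import pandas as pd

# Hypothetical example data: two rows (x, y) and three columns (a, b, c).
scores = pd.DataFrame(
    {"a": [1, 2], "b": [10, 20], "c": [100, 200]},
    index=["x", "y"],
)

# axis=1: add across the columns, giving one total per row.
row_totals = scores.sum(axis=1)

# Restrict the sum to some columns by selecting them first.
subset_totals = scores[["a", "b"]].sum(axis=1)

# axis=0 (the default): add down the rows, giving one total per column.
column_totals = scores.sum(axis=0)

print(row_totals)
print(subset_totals)
print(column_totals)
```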
Create a MultiIndex with GroupBy.cumcount, which numbers each occurrence of a label; unstack then reshapes that counter level into the new column names:
df.index = [df.index, df.groupby(level=0).cumcount()]
print (df)
            0
Male   0  511
Female 0  461
Male   1  273
Female 1  217
Male   2  394
Female 2  337
       3  337
Male   3  337
df = df[0].unstack()
print (df)
          0    1    2    3
Female  461  217  337  337
Male    511  273  394  337
Then sum each row with axis=1:
print (df.sum(axis=1))
Female 1352
Male 1515
dtype: int64
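For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes only the eight rows visible in the question (the real frame has 86) and builds the frame by hand instead of reading the Excel file. The names `occurrence`, `wide`, `totals` and `direct_totals` are introduced here for illustration.

```python
import pandas as pd

# Sample data: only the eight rows shown in the question, built by hand.
df = pd.DataFrame(
    {0: [511, 461, 273, 217, 394, 337, 337, 337]},
    index=["Male", "Female", "Male", "Female",
           "Male", "Female", "Female", "Male"],
)

# Number each occurrence of a label: 0, 1, 2, ... within Male and within Female.
occurrence = df.groupby(level=0).cumcount()

# Two-level index (label, occurrence number), as in the answer above.
df.index = pd.MultiIndex.from_arrays([df.index, occurrence])

# Pivot the occurrence level into columns: one row per label.
wide = df[0].unstack()
print(wide)

# Total per label, summing across the columns of each row.
totals = wide.sum(axis=1)
print(totals)

# If only the totals are needed, grouping by the label level gives them directly.
direct_totals = df[0].groupby(level=0).sum()
print(direct_totals)
```

With unequal numbers of Male and Female rows, unstack fills the missing positions with NaN; sum skips NaN by default, so the totals are still correct.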