I have a dataframe that I have pivoted:
FinancialYear  2014/2015  2015/2016  2016/2017  2017/2018
Month
April                 42         32         29         27
August                34         28         32          0
December              45         51         28          0
February              28         20         28          0
January               32         28         33          0
July                  40         66         31         30
June                  32         67         37         35
March                 43         36         39          0
May                   34         30         24         29
November              39         32         31          0
October               38         39         28          0
September             29         19         34          0
This is the code that I used:
new_hm01 = hmdf[['FinancialYear','Month','FirstReceivedDate']]
hm05 = new_hm01.pivot_table(index=['FinancialYear','Month'], aggfunc='count')
df_hm = new_hm01.groupby(['Month', 'FinancialYear']).size().unstack(fill_value=0).rename(columns=lambda x: '{}'.format(x))
The months are not in the order I want (they are sorted alphabetically), so I used the following code to reindex the rows according to a list:
vals = ['April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December', 'January', 'February', 'March']
df_hm = df_hm.reindex(vals)
This put the rows in the right order, but almost all of the values in my table now show NaN:
FinancialYear  2014/2015  2015/2016  2016/2017  2017/2018
Month
April                nan        nan        nan        nan
May                  nan        nan        nan        nan
June                 nan        nan        nan        nan
July                 nan        nan        nan        nan
August               nan        nan        nan        nan
September             29         19         34          0
October              nan        nan        nan        nan
November             nan        nan        nan        nan
December             nan        nan        nan        nan
January              nan        nan        nan        nan
February             nan        nan        nan        nan
March                nan        nan        nan        nan
Any idea what is happening and how to fix it? Is there a better alternative method?
To sort a pandas DataFrame by its index, you can use the DataFrame.sort_index() method. Set the boolean argument ascending to True (the default) or False to sort the index in ascending or descending order. The rows are rearranged together with their index labels.
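As a minimal sketch of sort_index(), using a small made-up DataFrame (the data and the names asc and desc are only for illustration): note that string labels sort alphabetically, not in calendar order, which is why month names need an explicit ordering.

```python
import pandas as pd

# Hypothetical data: month names as the index, in arbitrary order.
df = pd.DataFrame({'col': [1, 2, 3]}, index=['May', 'April', 'June'])

asc = df.sort_index()                  # ascending (alphabetical) order
desc = df.sort_index(ascending=False)  # descending order

print(asc.index.tolist())   # ['April', 'June', 'May']
print(desc.index.tolist())  # ['May', 'June', 'April']
```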
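Unexpected NaNs after reindexing are usually caused by the new index labels not exactly matching the existing ones: reindex fills every label it cannot find with NaN. In your output only September survives, and it is the longest month name, which would be consistent with the other names being padded with trailing spaces to a fixed width (this is a guess, since the raw labels are not shown). For example, if the original index labels contain trailing whitespace but the new labels don't, you get NaNs: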
import numpy as np
import pandas as pd
df = pd.DataFrame({'col':[1,2,3]}, index=['April ', 'June ', 'May ', ])
print(df)
# col
# April 1
# June 2
# May 3
df2 = df.reindex(['April', 'May', 'June'])
print(df2)
# col
# April NaN
# May NaN
# June NaN
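To check whether this is what is happening, one option is to print the raw labels and list the requested labels that are absent from the index. This is only a sketch on the toy data above; wanted and missing are names made up for the illustration.

```python
import pandas as pd

df = pd.DataFrame({'col': [1, 2, 3]}, index=['April ', 'June ', 'May '])
wanted = ['April', 'May', 'June']

# tolist() prints the labels with quotes, so stray whitespace is visible.
print(df.index.tolist())  # ['April ', 'June ', 'May ']

# Requested labels that do not exist in the current index.
missing = [label for label in wanted if label not in df.index]
print(missing)  # ['April', 'May', 'June']
```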
This can be fixed by removing the whitespace to make the labels match:
df.index = df.index.str.strip()
df3 = df.reindex(['April', 'May', 'June'])
print(df3)
# col
# April 1
# May 3
# June 2
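As a possible alternative to reindexing afterwards, the Month column could be cleaned and stored as an ordered categorical before grouping, so that the groupby result already follows the financial-year order. The sketch below assumes the month names carry trailing spaces and uses a small made-up stand-in for hmdf; the real data may differ.

```python
import pandas as pd

vals = ['April', 'May', 'June', 'July', 'August', 'September',
        'October', 'November', 'December', 'January', 'February', 'March']

# Hypothetical stand-in for hmdf, with space-padded month labels.
hmdf = pd.DataFrame({
    'FinancialYear': ['2014/2015', '2014/2015', '2015/2016', '2015/2016'],
    'Month': ['May      ', 'April    ', 'April    ', 'March    '],
})

# Strip the whitespace, then store Month as an ordered categorical
# whose category order is the desired row order.
hmdf['Month'] = pd.Categorical(hmdf['Month'].str.strip(),
                               categories=vals, ordered=True)

# Groups are sorted by category order rather than alphabetically.
df_hm = (hmdf.groupby(['Month', 'FinancialYear'], observed=False)
             .size()
             .unstack(fill_value=0))
print(df_hm)
```

With observed=False, months that have no rows should still appear, filled with zeros. The resulting index is a CategoricalIndex, which keeps the custom order through later sorts.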