Sort Index by list - Python Pandas

Tags:

pandas

I have a dataframe that I have pivoted:

FinancialYear   2014/2015   2015/2016   2016/2017   2017/2018
Month               
April             42           32          29          27
August            34           28          32           0
December          45           51          28           0
February          28           20          28           0
January           32           28          33           0
July              40           66          31          30
June              32           67          37          35
March             43           36          39           0
May               34           30          24          29
November          39           32          31           0
October           38           39          28           0
September         29           19          34           0

This is the code that I used:

new_hm01 = hmdf[['FinancialYear','Month','FirstReceivedDate']]

hm05 = new_hm01.pivot_table(index=['FinancialYear','Month'], aggfunc='count')

df_hm = new_hm01.groupby(['Month', 'FinancialYear']).size().unstack(fill_value=0).rename(columns=lambda x: '{}'.format(x))

The months are not in the order I want, so I used the following code to reindex the table according to a list:

vals = ['April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December', 'January', 'February', 'March']

df_hm = df_hm.reindex(vals)

The rows are now in the right order, but almost every value in the table has become NaN.

FinancialYear   2014/2015   2015/2016   2016/2017   2017/2018
Month               
April              nan          nan         nan         nan
May                nan          nan         nan         nan
June               nan          nan         nan         nan
July               nan          nan         nan         nan
August             nan          nan         nan         nan
September           29           19          34           0
October            nan          nan         nan         nan
November           nan          nan         nan         nan
December           nan          nan         nan         nan
January            nan          nan         nan         nan
February           nan          nan         nan         nan
March              nan          nan         nan         nan

Any idea what is happening and how to fix it? Is there a better alternative method?


asked Jul 29 '17 12:07

1 Answer

Unexpected NaNs after reindexing are often caused by the new index labels not exactly matching the old ones. For example, if the original index labels contain trailing whitespace but the new labels don't, reindex finds no matches and fills every row with NaN:

import pandas as pd

# Note the trailing space in each index label
df = pd.DataFrame({'col': [1, 2, 3]}, index=['April ', 'June ', 'May '])
print(df)
#         col
# April     1
# June      2
# May       3

df2 = df.reindex(['April', 'May', 'June'])
print(df2)
#        col
# April  NaN
# May    NaN
# June   NaN
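A quick way to confirm that mismatched labels are the cause is to look at the raw index strings and list the requested labels that the index does not contain. The sketch below is only illustrative: the name `wanted` and the sample frame are assumptions for the demo, not part of the original question.

```python
import pandas as pd

# Hypothetical frame whose index labels carry trailing spaces
df = pd.DataFrame({'col': [1, 2, 3]}, index=['April ', 'June ', 'May '])
wanted = ['April', 'May', 'June']

# tolist() shows the quoted strings, so stray whitespace becomes visible
print(df.index.tolist())
# ['April ', 'June ', 'May ']

# Requested labels that are absent from the current index
missing = [label for label in wanted if label not in df.index]
print(missing)
# ['April', 'May', 'June']
```

If `missing` is non-empty, those are the rows that reindex will fill with NaN.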

This can be fixed by removing the whitespace to make the labels match:

df.index = df.index.str.strip()
df3 = df.reindex(['April', 'May', 'June'])
print(df3)
#        col
# April    1
# May      3
# June     2
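As for an alternative to reindex, one option is to turn the cleaned index into an ordered CategoricalIndex and sort it. This is a sketch under assumptions: the small `df_hm` below is a hypothetical stand-in for the asker's pivoted table, and the trailing spaces in its labels are assumed for illustration, since the question does not show the raw index.

```python
import pandas as pd

# Financial-year month order from the question
vals = ['April', 'May', 'June', 'July', 'August', 'September',
        'October', 'November', 'December', 'January', 'February', 'March']

# Hypothetical pivoted table: alphabetical rows, labels padded with a space
df_hm = pd.DataFrame(
    {'2014/2015': [42, 34, 32, 34], '2015/2016': [32, 28, 67, 30]},
    index=pd.Index(['April ', 'August ', 'June ', 'May '], name='Month'),
)

# Strip the labels, then impose the custom order through the categories
df_hm.index = pd.CategoricalIndex(
    df_hm.index.str.strip(), categories=vals, ordered=True, name='Month'
)
df_hm = df_hm.sort_index()

print(df_hm.index.tolist())
# ['April', 'May', 'June', 'August']
```

Unlike reindex, sorting never replaces existing rows with NaN rows, so the data is kept even when a label fails to match one of the categories.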


answered Sep 29 '22 04:09

unutbu

ScoutEU
