I want to replace a column of values in a DataFrame with a more accurate/complete set of values generated by a look-up table in the form of a Series that I have prepared.
I thought I could do it this way but the result is not as expected.
Here is the DataFrame I want to fix:
In [6]: df_normalised.head(10)
Out[6]:
code name
0 8 Human development
1 11
2 1 Economic management
3 6 Social protection and risk management
4 5 Trade and integration
5 2 Public sector governance
6 11 Environment and natural resources management
7 6 Social protection and risk management
8 7 Social dev/gender/inclusion
9 7 Social dev/gender/inclusion
(Note the missing name in row 2).
Here is the look-up table I created to do the fixing:
In [20]: names
Out[20]:
1 Economic management
10 Rural development
11 Environment and natural resources management
2 Public sector governance
3 Rule of law
4 Financial and private sector development
5 Trade and integration
6 Social protection and risk management
7 Social dev/gender/inclusion
8 Human development
9 Urban development
dtype: object
Here is the way I thought could do it:
In [21]: names[df_normalised.head(10).code]
Out[21]:
code
8 Human development
11 Environment and natural resources management
1 Economic management
6 Social protection and risk management
5 Trade and integration
2 Public sector governance
11 Environment and natural resources management
6 Social protection and risk management
7 Social dev/gender/inclusion
7 Social dev/gender/inclusion
dtype: object
However, I expected the resulting series above to have the same index as the index of df_normalised (i.e. 0, 1, 2, 3) not an index based on the code values.
So I'm not sure how to replace the original values in the 'name' column in df_normalised with these series values because the indexes are not the same.
Incidentally, how is it possible to have an index with duplicate values as above?
You can replace all values or selected values in a column of pandas DataFrame based on condition by using DataFrame. loc[] , np. where() and DataFrame. mask() methods.
Pandas replace multiple values in column replace. By using DataFrame. replace() method we will replace multiple values with multiple new strings or text for an individual DataFrame column. This method searches the entire Pandas DataFrame and replaces every specified value.
you can use map() function for that:
In [38]: df_normalised['name'] = df_normalised['code'].map(name)
In [39]: df_normalised
Out[39]:
code name
0 8 Human development
1 11 Environment and natural resources management
2 1 Economic management
3 6 Social protection and risk management
4 5 Trade and integration
5 2 Public sector governance
6 11 Environment and natural resources management
7 6 Social protection and risk management
8 7 Social dev/gender/inclusion
9 7 Social dev/gender/inclusion
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With