I've got the following code, and it works. It renames values in a column so that the data can be merged later.
import pandas as pd

pop = pd.read_csv('population.csv')
pop_recent = pop[pop['Year'] == 2014]
mapping = {
    'Korea, Rep.': 'South Korea',
    'Taiwan, China': 'Taiwan'
}
f = lambda x: mapping.get(x, x)
pop_recent['Country Name'] = pop_recent['Country Name'].map(f)
However, it produces this warning:
Warning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
pop_recent['Country Name'] = pop_recent['Country Name'].map(f)
I did google this! But no examples seem to be using map, so I'm at a loss...
When pandas reports that a value is trying to be set on a copy of a slice from a DataFrame, one way to avoid the SettingWithCopyWarning is to combine the chained operations into a single loc operation. This ensures that the assignment happens on the original DataFrame instead of a copy.
A SettingWithCopyWarning warns of a potential bug and should not be ignored just because the program runs as expected. The warning arises when a line of code both gets an item and sets an item: pandas does not guarantee whether the get step returns a view or a copy of the DataFrame.
The issue is with chained indexing. What you are actually trying to do is set values on -
pop[pop['Year'] == 2014]['Country Name']
This will not work most of the time (as explained very well in the linked documentation), because it is two separate calls and one of them may return a copy of the DataFrame (I believe the boolean indexing is the call that returns the copy).
Hence, when you set values on that copy, the change is not reflected in the original DataFrame. Example -
In [6]: df
Out[6]:
A B
0 1 2
1 3 4
2 4 5
3 6 7
4 8 9
In [7]: df[df['A']==1]['B'] = 10
/path/to/ipython-script.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
if __name__ == '__main__':
In [8]: df
Out[8]:
A B
0 1 2
1 3 4
2 4 5
3 6 7
4 8 9
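The session above can be reproduced as a standalone script. This is only a sketch: the frame is rebuilt by hand from the values printed in the demo, and the warning is silenced just so the script runs quietly. The variable name b_after_chained is introduced here for illustration.

```python
import warnings

import pandas as pd

# Hypothetical reconstruction of the demo frame from the printed values above.
df = pd.DataFrame({"A": [1, 3, 4, 6, 8], "B": [2, 4, 5, 7, 9]})

with warnings.catch_warnings():
    # Silence the chained-assignment warning for this demonstration only.
    warnings.simplefilter("ignore")
    # Two separate indexing calls: the boolean filter builds a new object,
    # and the assignment lands on that temporary object, not on df.
    df[df["A"] == 1]["B"] = 10

# Column B of the original frame after the chained assignment.
b_after_chained = df["B"].tolist()
print(b_after_chained)
```

Column B still holds its original values afterwards, which is the silent failure the warning is pointing at.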
As noted, instead of chained indexing you should use DataFrame.loc to index the rows as well as the columns to update in a single call, which avoids this warning. Example -
pop.loc[(pop['Year'] == 2014), 'Country Name'] = pop.loc[(pop['Year'] == 2014), 'Country Name'].map(f)
Or, if this seems too long to you, you can create a mask (a boolean Series) beforehand, assign it to a variable, and use that in the above statement. Example -
mask = pop['Year'] == 2014
pop.loc[mask, 'Country Name'] = pop.loc[mask, 'Country Name'].map(f)
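For a version that runs without population.csv, here is a minimal sketch of the mask approach. The small frame is a made-up stand-in (only the column names and the mapping come from the question), and rename is a hypothetical helper equivalent to the lambda f.

```python
import pandas as pd

# Hypothetical stand-in for population.csv; the rows are invented for illustration.
pop = pd.DataFrame({
    "Country Name": ["Korea, Rep.", "Taiwan, China", "France", "Korea, Rep."],
    "Year": [2014, 2014, 2014, 2013],
})

mapping = {"Korea, Rep.": "South Korea", "Taiwan, China": "Taiwan"}


def rename(name):
    # Return the mapped name, or the original name when there is no mapping.
    return mapping.get(name, name)


# Build the row selector once, then read and write through a single .loc call each.
mask = pop["Year"] == 2014
pop.loc[mask, "Country Name"] = pop.loc[mask, "Country Name"].map(rename)

names = pop["Country Name"].tolist()
print(names)
```

Only the 2014 rows are renamed; the 2013 row keeps 'Korea, Rep.' because the mask excludes it.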
Demo -
In [9]: df
Out[9]:
A B
0 1 2
1 3 4
2 4 5
3 6 7
4 8 9
In [10]: mapping = { 1:2 , 3:4}
In [11]: f= lambda x: mapping.get(x, x)
In [12]: df.loc[(df['B']==2),'A'] = df.loc[(df['B']==2),'A'].map(f)
In [13]: df
Out[13]:
A B
0 2 2
1 3 4
2 4 5
3 6 7
4 8 9
Demo with the mask method -
In [18]: df
Out[18]:
A B
0 1 2
1 3 4
2 4 5
3 6 7
4 8 9
In [19]: mask = df['B']==2
In [20]: df.loc[mask,'A'] = df.loc[mask,'A'].map(f)
In [21]: df
Out[21]:
A B
0 2 2
1 3 4
2 4 5
3 6 7
4 8 9
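Note that the .loc approach updates pop itself. If what you want is a separate pop_recent frame, as in the question, a commonly used alternative (not part of the answer above, so treat it as an assumption about your intent) is to take an explicit copy of the filtered rows before assigning. The sample rows below are again made up.

```python
import pandas as pd

# Hypothetical stand-in for population.csv; the rows are invented for illustration.
pop = pd.DataFrame({
    "Country Name": ["Korea, Rep.", "Taiwan, China", "France", "Korea, Rep."],
    "Year": [2014, 2014, 2014, 2013],
})

mapping = {"Korea, Rep.": "South Korea", "Taiwan, China": "Taiwan"}

# .copy() makes pop_recent an independent DataFrame, so assigning to one of
# its columns is unambiguous and does not touch pop.
pop_recent = pop[pop["Year"] == 2014].copy()
pop_recent["Country Name"] = pop_recent["Country Name"].map(
    lambda name: mapping.get(name, name)
)

recent_names = pop_recent["Country Name"].tolist()
original_names = pop["Country Name"].tolist()
print(recent_names)
print(original_names)
```

The trade-off is that pop keeps the old names, so any later merge has to use pop_recent rather than pop.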