How do I change the special characters to the usual alphabet letters? This is my dataframe:
In [56]: cities
Out[56]:
Table Code Country Year City Value
240 Åland Islands 2014.0 MARIEHAMN 11437.0 1
240 Åland Islands 2010.0 MARIEHAMN 5829.5 1
240 Albania 2011.0 Durrës 113249.0
240 Albania 2011.0 TIRANA 418495.0
240 Albania 2011.0 Durrës 56511.0
I want it to look like this:
In [56]: cities
Out[56]:
Table Code Country Year City Value
240 Aland Islands 2014.0 MARIEHAMN 11437.0 1
240 Aland Islands 2010.0 MARIEHAMN 5829.5 1
240 Albania 2011.0 Durres 113249.0
240 Albania 2011.0 TIRANA 418495.0
240 Albania 2011.0 Durres 56511.0
Add df = df.astype(float) after the replace and you've got it. I'd skip inplace and just do df = df.replace('\*', '', regex=True).
To remove a character in an R data frame column, we can use gsub function which will replace the character with blank. For example, if we have a data frame called df that contains a character column say x which has a character ID in each value then it can be removed by using the command gsub("ID","",as.
To strip whitespace from column names, you can use str.strip, str.lstrip and str.rstrip.
The pandas method is to use the vectorised str.normalize combined with str.encode and str.decode:
In [60]:
df['Country'].str.normalize('NFKD').str.encode('ascii', errors='ignore').str.decode('utf-8')
Out[60]:
0 Aland Islands
1 Aland Islands
2 Albania
3 Albania
4 Albania
Name: Country, dtype: object
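To see why this chain works, here is a minimal standard-library sketch of the same three steps on a single string. The helper name strip_accents is made up for illustration. NFKD splits an accented letter into a base letter plus a combining mark, and the ASCII encode with errors='ignore' then drops the mark. Note that a character with no such decomposition (for example ø) is dropped entirely rather than transliterated.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose accented characters, then drop everything outside ASCII."""
    # 'Å' becomes 'A' followed by a combining ring (U+030A).
    decomposed = unicodedata.normalize("NFKD", text)
    # The combining mark is not ASCII, so errors="ignore" discards it.
    return decomposed.encode("ascii", errors="ignore").decode("ascii")


print(strip_accents("Åland Islands"))  # Aland Islands
print(strip_accents("Durrës"))         # Durres
# 'ø' has no decomposition, so it is removed, not turned into 'o':
print(strip_accents("Tromsø"))         # Troms
```

If losing such characters is a problem, a transliteration library may be a better fit than plain normalization.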
So to do this for all str dtypes:
In [64]:
cols = df.select_dtypes(include=[object]).columns  # np.object was removed in NumPy 1.24; use the builtin object
df[cols] = df[cols].apply(lambda x: x.str.normalize('NFKD').str.encode('ascii', errors='ignore').str.decode('utf-8'))
df
Out[64]:
Table Code Country Year City Value
0 240 Aland Islands 2014.0 MARIEHAMN 11437.0 1
1 240 Aland Islands 2010.0 MARIEHAMN 5829.5 1
2 240 Albania 2011.0 Durres 113249.0
3 240 Albania 2011.0 TIRANA 418495.0
4 240 Albania 2011.0 Durres 56511.0
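The same idea can be wrapped in a small reusable function. This is only a sketch: the name ascii_fold_frame and the three-column sample frame are made up for illustration, and it assumes the text columns contain only strings (no missing values). It picks columns with pandas.api.types.is_string_dtype instead of select_dtypes, and returns a copy so the original frame is left untouched.

```python
import pandas as pd
from pandas.api.types import is_string_dtype


def ascii_fold_frame(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of frame with accents stripped from every string column."""
    out = frame.copy()
    for col in out.columns:
        # Numeric columns are skipped; only string-like columns are rewritten.
        if is_string_dtype(out[col]):
            out[col] = (
                out[col]
                .str.normalize("NFKD")
                .str.encode("ascii", errors="ignore")
                .str.decode("ascii")
            )
    return out


# Small sample modelled on the question's data (illustrative only).
cities = pd.DataFrame(
    {
        "Country": ["Åland Islands", "Albania", "Albania"],
        "Year": [2014.0, 2011.0, 2011.0],
        "City": ["MARIEHAMN", "Durrës", "TIRANA"],
    }
)
cleaned = ascii_fold_frame(cities)
print(cleaned)
```

Usage is then a single call, cleaned = ascii_fold_frame(cities), with the original cities frame still holding the accented values.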
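With a pandas Series, using the third-party unidecode package:
import unidecode
def remove_accents(a):
    # In Python 3 the values are already str, so no .decode('utf-8') is needed
    return unidecode.unidecode(a)
df['column'] = df['column'].apply(remove_accents)
In this case unidecode transliterates each string to its closest ASCII equivalent.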