
Pandas apply unidecode to several columns

I am trying to convert all non-ASCII characters in two columns of a pandas DataFrame to ASCII. Simply applying the function to the relevant columns doesn't work: Python raises an AttributeError stating that the 'Series' object has no attribute 'encode'.

import pandas as pd
import numpy as np
from unidecode import unidecode

try_data = pd.DataFrame({
    'Units': np.array([3, 4, 5, 6, 10], dtype='int32'),
    'Description_PD': pd.Categorical(['VEIJA 5 TRIÂNGULOS 200', 'QUEIJO BOLA', 'QJ BOLA GRD', 'VEIJO A VACA TRIÂNGULOS 100', 'HEITE GORDO TERRA']),
    'Description_Externa': pd.Categorical(['SQP 4 porções', 'Bola', ' SIESTA BOLA', 'SQP 16 porções', 'TERRA NOSTRA']),
})

# Raises AttributeError: 'Series' object has no attribute 'encode'
try_data[['Description_PD', 'Description_Externa']].apply(unidecode)
S.K. asked Jun 14 '17 08:06


1 Answer

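Iterate over the list of columns and call apply on each column inside the loop. Your attempt fails because DataFrame.apply passes each whole column (a Series) to unidecode, which expects a single string, whereas Series.apply calls it on the individual values: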

In[47]:
for col in ['Description_PD','Description_Externa']:
    try_data[col] = try_data[col].apply(unidecode)
try_data

Out[47]: 
  Description_Externa               Description_PD  Units
0       SQP 4 porcoes       VEIJA 5 TRIANGULOS 200      3
1                Bola                  QUEIJO BOLA      4
2         SIESTA BOLA                  QJ BOLA GRD      5
3      SQP 16 porcoes  VEIJO A VACA TRIANGULOS 100      6
4        TERRA NOSTRA            HEITE GORDO TERRA     10
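
The loop above can also be written as a single assignment. The following is a minimal sketch, not the answerer's code: the small `sample` frame and the `text_cols` name are made up for illustration, and the `unicodedata` fallback is an assumption added only so the snippet runs when the unidecode package is not installed (it drops characters that have no ASCII decomposition, which unidecode itself would transliterate).

```python
import unicodedata

import pandas as pd

try:
    from unidecode import unidecode
except ImportError:
    # Assumed stand-in, only used when unidecode is missing:
    # decompose accented characters and drop the non-ASCII parts.
    def unidecode(text):
        decomposed = unicodedata.normalize("NFKD", text)
        return decomposed.encode("ascii", "ignore").decode("ascii")

# Hypothetical two-row sample with plain string columns
sample = pd.DataFrame({
    "Units": [3, 4],
    "Description_PD": ["VEIJA 5 TRIÂNGULOS 200", "QUEIJO BOLA"],
    "Description_Externa": ["SQP 4 porções", "Bola"],
})

text_cols = ["Description_PD", "Description_Externa"]

# DataFrame.apply hands each column to the lambda as a Series;
# Series.map then calls unidecode on every value in that column.
sample[text_cols] = sample[text_cols].apply(lambda col: col.map(unidecode))
```

The outer apply walks over the columns and the inner map walks over the values, so unidecode only ever receives strings. Columns not listed in `text_cols` are left untouched.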

For instance, calling apply on a single column works fine:

In[49]:
try_data['Description_Externa'].apply(unidecode)

Out[49]: 
0     SQP 4 porcoes
1              Bola
2       SIESTA BOLA
3    SQP 16 porcoes
4      TERRA NOSTRA
Name: Description_Externa, dtype: category
Categories (5, object): [SIESTA BOLA, Bola, SQP 16 porcoes, SQP 4 porcoes, TERRA NOSTRA]
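
To see why the single-column call behaves differently from the two-column one, it can help to check what each apply passes to the function. This is a small sketch on a made-up frame (`df`, `per_column` and `per_element` are illustrative names, not from the question):

```python
import pandas as pd

# Hypothetical frame with two string columns
df = pd.DataFrame({"a": ["x", "y"], "b": ["z", "w"]})

# DataFrame.apply calls the function once per column, passing a Series
per_column = df.apply(lambda col: type(col).__name__)

# Series.apply calls the function once per value, passing the value itself
per_element = df["a"].apply(lambda value: type(value).__name__)
```

Here `per_column` reports `Series` for each column while `per_element` reports `str` for each value, which matches the error in the question: unidecode was handed a Series and tried to call `.encode` on it.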
EdChum answered Nov 01 '22 07:11