
Iterate over certain columns in a data frame

Hi, I have a data frame like this:

    Ticker  P/E     P/S     P/B    P/FCF    Dividend
No.                     
1   NTCT    457.32  3.03    1.44    26.04   -
2   GWRE    416.06  9.80    5.33    45.62   -
3   PEGA    129.02  4.41    9.85    285.10  0.28%
4   BLKB    87.68   4.96    14.36   41.81   0.62%

First, I want to convert the columns that contain numbers (currently stored as strings) to float. Here that means the four middle columns: P/E, P/S, P/B and P/FCF. Would a simple loop work in this case?

Second, the last column, 'Dividend', holds percentage values as strings. I can convert them to decimals, but I was wondering whether there is a way to keep the % sign and still have values I can calculate with.

Any ideas for those two issues?

Alex T asked Oct 17 '22 14:10


2 Answers

Plan

  • Take out 'Ticker' because it isn't numeric
  • Use assign to overwrite Dividend, stripping off the %
  • Use apply with pd.to_numeric to convert all the remaining columns
  • Use eval to scale Dividend from percent to a decimal fraction


df[['Ticker']].join(
    df.assign(
        Dividend=df.Dividend.str.strip('%')
    ).drop(columns='Ticker').apply(
        pd.to_numeric, errors='coerce'
    )
).eval('Dividend = Dividend / 100', inplace=False)

    Ticker     P/E   P/S    P/B   P/FCF  Dividend
No.                                              
1     NTCT  457.32  3.03   1.44   26.04       NaN
2     GWRE  416.06  9.80   5.33   45.62       NaN
3     PEGA  129.02  4.41   9.85  285.10    0.0028
4     BLKB   87.68  4.96  14.36   41.81    0.0062

The same steps in more lines, but more readable:

nums = df.drop(columns='Ticker').assign(Dividend=df.Dividend.str.strip('%'))
nums = nums.apply(pd.to_numeric, errors='coerce')
nums = nums.assign(Dividend=nums.Dividend / 100)
df[['Ticker']].join(nums)

    Ticker     P/E   P/S    P/B   P/FCF  Dividend
No.                                              
1     NTCT  457.32  3.03   1.44   26.04       NaN
2     GWRE  416.06  9.80   5.33   45.62       NaN
3     PEGA  129.02  4.41   9.85  285.10    0.0028
4     BLKB   87.68  4.96  14.36   41.81    0.0062
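
For anyone who wants to run the readable version end to end, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the question's table, so its construction is an assumption about how the original data was stored. The final `map` call is an addition that is not part of the answer above: it shows one possible way to display the decimals with a % sign while the stored column stays numeric, which is the second thing the question asked about.

```python
import pandas as pd

# Sample data rebuilt by hand from the question; every value starts as a string.
df = pd.DataFrame(
    {
        "Ticker": ["NTCT", "GWRE", "PEGA", "BLKB"],
        "P/E": ["457.32", "416.06", "129.02", "87.68"],
        "P/S": ["3.03", "9.80", "4.41", "4.96"],
        "P/B": ["1.44", "5.33", "9.85", "14.36"],
        "P/FCF": ["26.04", "45.62", "285.10", "41.81"],
        "Dividend": ["-", "-", "0.28%", "0.62%"],
    },
    index=pd.Index([1, 2, 3, 4], name="No."),
)

# Strip the percent sign, coerce everything except Ticker to numbers
# ("-" becomes NaN), then scale Dividend from percent to a decimal fraction.
nums = df.drop(columns="Ticker").assign(Dividend=df["Dividend"].str.strip("%"))
nums = nums.apply(pd.to_numeric, errors="coerce")
nums = nums.assign(Dividend=nums["Dividend"] / 100)
result = df[["Ticker"]].join(nums)

# Display-only formatting: result["Dividend"] stays float, and only this
# separate Series of strings carries the "%" sign. NaN rows are left alone.
shown = result["Dividend"].map("{:.2%}".format, na_action="ignore")

print(result.dtypes)
print(shown)
```

Keeping the numeric column and the formatted text separate means calculations always use `result`, and `shown` is only for printing or reports.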
piRSquared answered Oct 20 '22 09:10


Assuming that all P/... columns contain proper numbers:

In [47]: df.assign(Dividend=pd.to_numeric(df.Dividend.str.replace('%', '', regex=False), errors='coerce')
    ...:                      .div(100)) \
    ...:   .set_index('Ticker', append=True) \
    ...:   .astype('float') \
    ...:   .reset_index('Ticker')
    ...:
Out[47]:
    Ticker     P/E   P/S    P/B   P/FCF  Dividend
No.
1     NTCT  457.32  3.03   1.44   26.04       NaN
2     GWRE  416.06  9.80   5.33   45.62       NaN
3     PEGA  129.02  4.41   9.85  285.10    0.0028
4     BLKB   87.68  4.96  14.36   41.81    0.0062
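
The question also asked whether a simple loop would work. It should; the following is a minimal sketch rather than part of either answer, assuming the column names from the question and a hand-built two-row sample frame.

```python
import pandas as pd

# Two rows copied by hand from the question's table, all values as strings.
df = pd.DataFrame(
    {
        "Ticker": ["PEGA", "NTCT"],
        "P/E": ["129.02", "457.32"],
        "P/S": ["4.41", "3.03"],
        "P/B": ["9.85", "1.44"],
        "P/FCF": ["285.10", "26.04"],
        "Dividend": ["0.28%", "-"],
    }
)

# Convert the four ratio columns one at a time.
for col in ["P/E", "P/S", "P/B", "P/FCF"]:
    df[col] = pd.to_numeric(df[col], errors="coerce")

# Dividend needs the trailing "%" removed first; "-" becomes NaN
# because of errors="coerce". Dividing by 100 gives a decimal fraction.
df["Dividend"] = pd.to_numeric(df["Dividend"].str.rstrip("%"), errors="coerce") / 100

print(df.dtypes)
```

The loop is easier to follow when only a few named columns need converting; the `apply` and `astype` versions above scale better when there are many.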
MaxU - stop WAR against UA answered Oct 20 '22 09:10