
How to convert str to float in pandas

I'm trying to convert a string column in my dataset to a float type. Here is some context:

import pandas as pd
import numpy as np
import xlrd
file_location = "/Users/sekr2/Desktop/Jari/Leistungen/leistungen2_2017.xlsx"
workbook = xlrd.open_workbook(file_location)
sheet = workbook.sheet_by_index(0)

df = pd.read_excel("/Users/.../bla.xlsx")

df.head()

    Leistungserbringer Anzahl Leistung     AL      TL      TaxW    Taxpunkte
 0  McGregor Sarah  12  'Konsilium'     147.28  87.47   KVG     234.75
 1  McGregor Sarah  12  'Grundberatung' 47.00   67.47   KVG     114.47
 2  McGregor Sarah  12  'Extra 5min'    87.28   87.47   KVG     174.75
 3  McGregor Sarah  12  'Respirator'    147.28  102.01  KVG     249.29
 4  McGregor Sarah  12  'Besuch'        167.28  87.45   KVG     254.73

To keep working on this I need to find a way to create a new column: df['Leistungswert'] = df['Taxpunkte'] * df['Anzahl'] * df['TaxW'].

TaxW shows the string 'KVG' for each entry. I know from the data that 'KVG' = 0.89. I have hit a wall trying to convert the string into a float. I cannot just create a new column with the float type, because this code should also work with further inputs: the TaxW column contains about 7 distinct entries, each with a different value.

I'm thankful for all information on this matter.

KVG = 0.89

Jari Klingler asked Dec 24 '22 16:12



2 Answers

Assuming 'KVG' isn't the only possible string value in TaxW, you should store a mapping of strings to their float equivalent, like this:

map_ = {'KVG' : 0.89, ... } # add more fields here 

Then, you can use Series.map:

In [424]: df['Leistungswert'] = df['Taxpunkte'] * df['Anzahl'] * df['TaxW'].map(map_); df['Leistungswert']
Out[424]: 
0    2507.1300
1    1222.5396
2    1866.3300
3    2662.4172
4    2720.5164
Name: Leistungswert, dtype: float64
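
One thing to watch with Series.map: any TaxW code that is missing from map_ becomes NaN, and the NaN then propagates into Leistungswert without an error. The following is a minimal, self-contained sketch of the mapping approach with a check for unmapped codes. The sample rows are modelled on the question's df.head() output, 'XYZ' is a made-up code used only to trigger the check, and 'KVG' = 0.89 is the only rate taken from the question.

```python
import pandas as pd

# Sample rows modelled on the question's df.head() output.
df = pd.DataFrame({
    "Anzahl": [12, 12, 12],
    "TaxW": ["KVG", "KVG", "XYZ"],  # "XYZ" is a made-up code with no known rate
    "Taxpunkte": [234.75, 114.47, 174.75],
})

# Only 'KVG' = 0.89 comes from the question; add the real rates here.
map_ = {"KVG": 0.89}

# Codes that are missing from map_ are mapped to NaN.
factor = df["TaxW"].map(map_)

# Collect unmapped codes so they can be reported instead of silently ignored.
unmapped = sorted(df.loc[factor.isna(), "TaxW"].unique())

df["Leistungswert"] = df["Taxpunkte"] * df["Anzahl"] * factor
```

If unmapped is non-empty, you could raise an error or log the codes before using the new column.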

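Alternatively, you can use df.apply with axis=1 (recent pandas versions reject a row-wise function that returns a scalar when it is passed to df.transform). This evaluates row by row, so it is slower than Series.map on large frames:

In [435]: df['Leistungswert'] = df.apply(lambda x: x['Taxpunkte'] * x['Anzahl'] * map_[x['TaxW']], axis=1); df['Leistungswert']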
Out[435]: 
0    2507.1300
1    1222.5396
2    1866.3300
3    2662.4172
4    2720.5164
Name: Leistungswert, dtype: float64
cs95 answered Jan 04 '23 01:01



An alternative solution that uses the map_ mapping from cs95's answer (@COLDSPEED):

In [237]: df.assign(TaxW=df['TaxW'].map(map_)) \
            .eval("Leistungswert = Taxpunkte * Anzahl * TaxW", inplace=False)
Out[237]:
  Leistungserbringer  Anzahl       Leistung      AL      TL  TaxW  Taxpunkte  Leistungswert
0     McGregor Sarah      12      Konsilium  147.28   87.47  0.89     234.75      2507.1300
1     McGregor Sarah      12  Grundberatung   47.00   67.47  0.89     114.47      1222.5396
2     McGregor Sarah      12     Extra 5min   87.28   87.47  0.89     174.75      1866.3300
3     McGregor Sarah      12     Respirator  147.28  102.01  0.89     249.29      2662.4172
4     McGregor Sarah      12         Besuch  167.28   87.45  0.89     254.73      2720.5164
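
Because assign and eval (with the default inplace=False) both return new objects, the original df keeps its string TaxW column. A minimal, self-contained sketch under the same assumptions as above (sample rows modelled on the question, and only 'KVG' = 0.89 known):

```python
import pandas as pd

df = pd.DataFrame({
    "Anzahl": [12, 12],
    "TaxW": ["KVG", "KVG"],
    "Taxpunkte": [234.75, 114.47],
})
map_ = {"KVG": 0.89}  # only this rate is given in the question

# assign() returns a copy in which TaxW holds the numeric rate;
# eval() then adds the computed column to that copy.
result = (
    df.assign(TaxW=df["TaxW"].map(map_))
      .eval("Leistungswert = Taxpunkte * Anzahl * TaxW")
)
```

Here result carries the numeric TaxW and the new Leistungswert column, while df itself is unchanged.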
MaxU - stop WAR against UA answered Jan 04 '23 01:01
