pandas' iterrows changes the dtype of the values it returns: each row comes back as a Series with a single dtype. According to this GitHub issue, this is intended behavior.
Is there a Pythonic and elegant way to cast each row back to the original types? Note that my columns have several different types.
Minimal example:
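import pandas as pd

df = pd.DataFrame([range(5), range(5)])
# Replace column 1 with a float copy; assigning through df.iloc[:, 1] may keep the int dtype on newer pandas
df[1] = df[1].astype('float')
for row in df.iterrows():
    print(row)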
This results in:
(0, 0    0.0
1    1.0
2    2.0
3    3.0
4    4.0
Name: 0, dtype: float64)
(1, 0    0.0
1    1.0
2    2.0
3    3.0
4    4.0
Name: 1, dtype: float64)
Note that df.dtypes returns the column types; however, I couldn't think of an elegant way to use it to cast a row back to those types.
Try using df.itertuples instead:
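import pandas as pd

df = pd.DataFrame([range(5), range(5)], columns=list('abcde'))
# Replace column 'b' with a float copy (df.iloc[:, 1] = ... may keep the int dtype on newer pandas)
df['b'] = df['b'].astype('float')
for row in df.itertuples():
    print(row)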
Pandas(Index=0, a=0, b=1.0, c=2, d=3, e=4)
Pandas(Index=1, a=0, b=1.0, c=2, d=3, e=4)
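If you want to check the difference yourself, here is a small sketch that compares the first row from iterrows with the first row from itertuples. The variable names (first_series, first_tuple and so on) are just mine for illustration, and I use numbers.Integral so the check passes whether pandas hands back a Python int or a NumPy integer.

```python
import numbers

import pandas as pd

# Two rows, five columns; column 'b' is replaced with a float64 copy.
df = pd.DataFrame([list(range(5)), list(range(5))], columns=list("abcde"))
df["b"] = df["b"].astype("float")

# iterrows builds one Series per row, so mixed columns share a single dtype.
_, first_series = next(df.iterrows())
series_dtype = str(first_series.dtype)

# itertuples builds one namedtuple per row, so each field keeps its own type.
first_tuple = next(df.itertuples())
a_is_int = isinstance(first_tuple.a, numbers.Integral)
b_is_float = isinstance(first_tuple.b, float)

print(series_dtype)
print(a_is_int, b_is_float)
```

The row from iterrows has one dtype for all values, while the tuple fields keep the integer and float types of their columns.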
You can iterate while keeping each column's data type by looping over the index and reading each value with df.loc:
import pandas

df = pandas.DataFrame({'ints': list(range(5)), 'floats': [float(i) for i in range(5)]})
print(df)
for idx in df.index:
    print(f'Integer number: {df.loc[idx,"ints"]}')
    print(f'Float number: {df.loc[idx,"floats"]}')
The output (in Python 3.8.5) is
   ints  floats
0     0     0.0
1     1     1.0
2     2     2.0
3     3     3.0
4     4     4.0
Integer number: 0
Float number: 0.0
Integer number: 1
Float number: 1.0
Integer number: 2
Float number: 2.0
Integer number: 3
Float number: 3.0
Integer number: 4
Float number: 4.0
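The df.dtypes idea from the question can also be sketched directly, if you really need to stay with iterrows. This is only a rough sketch: typed_rows is a hypothetical helper name, and it assumes plain NumPy numeric dtypes. Because the values pass through float64 on the way, very large integers may not survive the round trip.

```python
import pandas as pd

df = pd.DataFrame({"ints": list(range(3)), "floats": [float(i) for i in range(3)]})


def typed_rows(frame):
    """Yield (index, dict) pairs with each value cast back to its column's dtype."""
    for idx, row in frame.iterrows():
        # frame.dtypes[col].type is the NumPy scalar type of the column, e.g. numpy.int64.
        yield idx, {col: frame.dtypes[col].type(row[col]) for col in frame.columns}


rows = list(typed_rows(df))
for idx, values in rows:
    print(idx, {col: type(value).__name__ for col, value in values.items()})
```

In most cases itertuples or index-based access is simpler, since the values never get upcast in the first place.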