pandas apply changing dtype

I'm trying to convert a pandas dataframe into a Series of tuples:

Example input:

df = pd.DataFrame([[1,2,3.0],[3,4,5.0]])

Desired Output:

0    (1, 2, 3.0)
1    (3, 4, 5.0)
dtype: object    

However, pandas seems to coerce my integer columns to floats.

I tried

import pandas as pd

df = pd.DataFrame([[1,2,3.0],[3,4,5]])
print(df)
print(df.dtypes)
print(df.apply(tuple,axis=1,reduce=False).apply(str))

Actual output:

   0  1    2
0  1  2  3.0
1  3  4  5.0

0      int64
1      int64
2    float64
dtype: object

0    (1.0, 2.0, 3.0)
1    (3.0, 4.0, 5.0)
dtype: object

A related question suggests using reduce=False, but this doesn't change anything for me.

Could someone explain why pandas is coercing the datatype somewhere along the way?

Sebastian Wozny asked Aug 29 '18 13:08


1 Answer

pandas.DataFrame.itertuples

Use itertuples to avoid forcing your ints to floats:

pd.Series([*df.itertuples(index=False)])

0    (1, 2, 3.0)
1    (3, 4, 5.0)
dtype: object
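
As for why the coercion happens in the first place, here is a minimal sketch using the question's example frame. When apply runs with axis=1, each row is handed to the function as a Series, and a Series has a single dtype, so a row mixing int64 and float64 values is upcast to float64 before tuple ever sees it. The variable names below (row, via_apply, via_itertuples) are only for illustration, and name=None is passed to itertuples to get plain tuples rather than namedtuples.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3.0], [3, 4, 5.0]])

# A single row pulled out as a Series must have one dtype, so the
# int64 and float64 values are upcast to a common float64.
row = df.iloc[0]
print(row.dtype)

# apply(..., axis=1) passes each row to the function as such a Series,
# so tuple() only ever receives floats.
via_apply = df.apply(tuple, axis=1)
print(via_apply.tolist())

# itertuples assembles each row from the individual columns, so every
# value keeps the type of the column it came from.
via_itertuples = pd.Series(list(df.itertuples(index=False, name=None)))
print(via_itertuples.tolist())
```

The column dtypes of df itself never change; only the per-row Series created along the way is upcast.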

zip, map, splat... magic

pd.Series([*zip(*map(df.get, df))])

0    (1, 2, 3.0)
1    (3, 4, 5.0)
dtype: object
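
The one-liner above can be unpacked step by step. This is only a sketch of what the expression does, with intermediate names (columns, rows, result) introduced here for readability:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3.0], [3, 4, 5.0]])

# Iterating a DataFrame yields its column labels, so map(df.get, df)
# produces one Series per column, each keeping its own dtype.
columns = [df.get(label) for label in df]

# zip(*columns) walks the columns in parallel and yields one tuple per row.
rows = list(zip(*columns))

# Wrapping the list of tuples in a Series gives an object-dtype Series.
result = pd.Series(rows)
print(result)
```

Because the values are read column by column, no mixed-type row Series is ever built, so the integers stay integers.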
piRSquared answered Sep 19 '22 01:09