
DataFrame of objects `astype(float)` behaviour different depending if lists or arrays

I'll preface this with the statement that I wouldn't do this in the first place and that I ran across this helping a friend.

Consider the data frame df

import numpy as np
import pandas as pd

df = pd.DataFrame(pd.Series([[1.2]]))

df

       0
0  [1.2]

This is a data frame of objects where the objects are lists. In my friend's code, they had:

df.astype(float)

Which breaks, as I had hoped:

ValueError: setting an array element with a sequence.

However, if those values were numpy arrays instead:

df = pd.DataFrame(pd.Series([np.array([1.2])]))

df

       0
0  [1.2]

And I tried the same thing:

df.astype(float)

     0
0  1.2

It's happy enough to do something and convert my 1-length arrays to scalars. This feels very dirty!

If instead they were not 1-length arrays:

df = pd.DataFrame(pd.Series([np.array([1.2, 1.3])]))

df

            0
0  [1.2, 1.3]

Then `df.astype(float)` breaks:

ValueError: setting an array element with a sequence.

Question
Please tell me this is a bug and we can fix it. Or can someone explain why and in what world this makes sense?


Response to @root
You are right. Is this worth an issue? Do you expect/want this?

a = np.empty((1,), object)
a[0] = np.array([1.2])

a.astype(float)

array([ 1.2])

And

a = np.empty((1,), object)
a[0] = np.array([1.2, 1.3])

a.astype(float)

ValueError: setting an array element with a sequence.
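The three cases from the question can be compared side by side at the NumPy level. This is a minimal sketch: `try_astype_float` is a hypothetical helper (not a NumPy or pandas function), and the warning filter is an assumption that newer NumPy releases may warn about converting a 1-length array to a scalar. The outcome of the 1-length case may therefore depend on the NumPy version installed.

```python
import warnings

import numpy as np


def try_astype_float(element):
    """Hypothetical helper: store `element` in a 1-slot object array and
    attempt astype(float), returning the result or the exception raised."""
    a = np.empty((1,), dtype=object)
    a[0] = element  # integer indexing stores the object itself, no broadcasting
    try:
        with warnings.catch_warnings():
            # Assumption: newer NumPy may emit a DeprecationWarning here.
            warnings.simplefilter("ignore")
            return a.astype(float)
    except (TypeError, ValueError) as exc:
        return exc


results = {
    "list [1.2]": try_astype_float([1.2]),
    "array [1.2]": try_astype_float(np.array([1.2])),
    "array [1.2, 1.3]": try_astype_float(np.array([1.2, 1.3])),
}

for label, outcome in results.items():
    print(label, "->", repr(outcome))
```

Only the 1-length array case has a chance of succeeding. The list and the 2-length array are both rejected.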
piRSquared asked Mar 07 '18 18:03


1 Answer

This is due to the unsafe default value of the `casting` argument of `astype`. In the docs, the `casting` argument is described as follows:

"Controls what kind of data casting may occur. Defaults to ‘unsafe’ for backwards compatibility." (my emphasis)

Any of the other possible castings raises a TypeError.

a = np.empty((1,), object)
a[0] = np.array([1.2])
a.astype(float, casting='same_kind')

Results in:

TypeError: Cannot cast array from dtype('O') to dtype('float64') according to the rule 'same_kind'

This is true for all castings except `unsafe`, namely: `no`, `equiv`, `safe`, and `same_kind`.
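The claim about the stricter casting rules can be checked with a short loop. This is a sketch rather than an exhaustive test. The `outcomes` dictionary is just an illustrative name, and the array is built exactly as in the question.

```python
import numpy as np

# Same object array as in the question: one slot holding a 1-length array.
a = np.empty((1,), dtype=object)
a[0] = np.array([1.2])

outcomes = {}
for rule in ("no", "equiv", "safe", "same_kind"):
    try:
        a.astype(float, casting=rule)
        outcomes[rule] = "ok"
    except TypeError:
        # The object -> float64 cast is refused before any element is inspected.
        outcomes[rule] = "TypeError"

print(outcomes)
```

Passing one of these rules explicitly is a way to make the object-to-float conversion fail loudly instead of silently unwrapping 1-length arrays.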

vmg answered Oct 05 '22 14:10