
Pandas convert types and set invalid values as na

Tags: python, pandas

Is it possible to convert pandas series values to a specific type and set the elements that cannot be converted to NA?

I found Series.astype(dtype, copy=True, raise_on_error=True) and set raise_on_error=False to avoid exceptions, but this won't set invalid items to NA...

Update

More precisely, I want to specify the type a column should be converted to. For a series containing the values [123, 'abc', '2010-01-01', 1.3] and a conversion to float, I'd expect [123.0, nan, nan, 1.3] as the result; if datetime is chosen, only series[2] would contain a valid datetime value. convert_objects doesn't allow this kind of flexibility, IMHO.

orange asked Sep 05 '14 06:09


1 Answer

I think you may have better luck with convert_objects:

In [10]: import pandas as pd

In [11]: s = pd.Series(['1', '2', 'a'])

In [12]: s.astype(int, raise_on_error=False)  # just returns s
Out[12]:
0    1
1    2
2    a
dtype: object

In [13]: s.convert_objects(convert_numeric=True)
Out[13]:
0     1
1     2
2   NaN
dtype: float64

Update: In more recent pandas, the convert_objects method has been deprecated in favor of pd.to_numeric:

In [21]: pd.to_numeric(s, errors='coerce')
Out[21]:
0    1.0
1    2.0
2    NaN
dtype: float64

This isn't quite as powerful/magical as convert_objects (which also worked on DataFrames) but works well and is more explicit in this case.
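
For the DataFrame case, one option is to apply pd.to_numeric column by column. This is only a rough sketch: the frame, its column names, and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical frame; column names and values are illustrative only.
df = pd.DataFrame({"a": ["1", "2", "x"], "b": ["1.5", "oops", "3"]})

# DataFrame.apply forwards extra keyword arguments to the function,
# so each column is passed through pd.to_numeric(col, errors="coerce").
numeric_df = df.apply(pd.to_numeric, errors="coerce")

print(numeric_df)
print(numeric_df.dtypes)
```

Entries that cannot be parsed end up as NaN, which makes the affected columns float64.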
See the object conversion section of the docs, which also covers the other to_* functions, such as pd.to_datetime and pd.to_timedelta.
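
For the series from the question, a minimal sketch might look like the following. One caveat: pd.to_datetime treats plain numbers as offsets from the epoch (nanoseconds by default), so 123 and 1.3 would not become NaT on their own. The sketch therefore masks non-string values first and assumes a %Y-%m-%d date format; both choices are assumptions, not something the question specified.

```python
import pandas as pd

s = pd.Series([123, "abc", "2010-01-01", 1.3], dtype=object)

# Float conversion: values that cannot be parsed as numbers become NaN.
as_float = pd.to_numeric(s, errors="coerce")

# Datetime conversion. Assumption: only strings should count as dates,
# so non-string entries are replaced with NaN before parsing.
is_text = s.map(lambda v: isinstance(v, str))
as_datetime = pd.to_datetime(
    s.where(is_text),
    format="%Y-%m-%d",  # assumed format; adjust to the actual data
    errors="coerce",    # strings that do not match the format become NaT
)

print(as_float)
print(as_datetime)
```

With this approach the target type is chosen by picking the matching to_* function, and errors='coerce' does the "set invalid values to NA" part in each case.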

Andy Hayden answered Oct 18 '22 19:10