
pandas DataFrame.replace function broken for datetime

In pandas v0.17.1 (anaconda python v3.4.3) the replace function on datetime is broken.

I am trying to replace a string value in my DataFrame with a new value. The DataFrame contains multiple columns, including a datetime column.
The replace call fails with the following chain of exceptions:

>>> from datetime import datetime
>>> import pandas as pd
>>> df = pd.DataFrame({'no':range(4), 'nm':list('abcd'), 'tm':datetime.now()})
>>> df.replace('a', 'A')

Traceback (most recent call last):
  File "/home/xxx/anaconda/envs/py3/lib/python3.4/site-packages/pandas/core/internals.py", line 2061, in _try_coerce_args
    other = other.astype('i8',copy=False).view('i8')
ValueError: invalid literal for int() with base 10: 'a'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/xxx/anaconda/envs/py3/lib/python3.4/site-packages/pandas/core/internals.py", line 594, in replace
    values, _, to_replace, _ = self._try_coerce_args(self.values, to_replace)
  File "/home/xxx/anaconda/envs/py3/lib/python3.4/site-packages/pandas/core/internals.py", line 2066, in _try_coerce_args
    raise TypeError
TypeError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "", line 1, in
  File "/home/xxx/anaconda/envs/py3/lib/python3.4/site-packages/pandas/core/generic.py", line 3110, in replace
    inplace=inplace, regex=regex)
  File "/home/xxx/anaconda/envs/py3/lib/python3.4/site-packages/pandas/core/internals.py", line 2870, in replace
    return self.apply('replace', **kwargs)
  File "/home/xxx/anaconda/envs/py3/lib/python3.4/site-packages/pandas/core/internals.py", line 2823, in apply
    applied = getattr(b, f)(**kwargs)
  File "/home/xxx/anaconda/envs/py3/lib/python3.4/site-packages/pandas/core/internals.py", line 607, in replace
    if not mask.any():
UnboundLocalError: local variable 'mask' referenced before assignment

The same code works fine on pandas version 0.16.2.
Is this a confirmed bug?

shanmuga asked Oct 31 '22 13:10



1 Answer

As commented, this is a bug that was present in 0.17.1 only. It is fixed in master and will be included in 0.18 (coming soon, in January 2016): https://github.com/pydata/pandas/issues/11868


As a workaround (assuming you have no duplicate column names), the Series replace still works fine in 0.17.1:

for c in df.select_dtypes(include=["object"]).columns:
    df[c] = df[c].replace('a', 'A')
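
For reference, the same workaround can be wrapped in a small helper and run end to end. This is only a sketch: the function name replace_in_object_columns is made up for illustration, and it returns a copy instead of modifying df in place. The 'nm' column is forced to object dtype here as an assumption, so that select_dtypes(include=["object"]) picks it up even on pandas versions that may infer a dedicated string dtype by default.

```python
from datetime import datetime

import pandas as pd


def replace_in_object_columns(df, to_replace, value):
    """Hypothetical helper: call Series.replace on object-dtype columns only.

    Datetime and numeric columns are never passed to replace, which avoids
    the DataFrame-level code path that fails in 0.17.1.
    """
    out = df.copy()
    for c in out.select_dtypes(include=["object"]).columns:
        out[c] = out[c].replace(to_replace, value)
    return out


# Same shape as the frame in the question; dtype=object is an assumption
# made to keep the example stable across pandas versions.
df = pd.DataFrame({
    "no": range(4),
    "nm": pd.Series(list("abcd"), dtype=object),
    "tm": datetime.now(),
})

result = replace_in_object_columns(df, "a", "A")
print(result)
```

Because only the object columns are touched, the 'tm' column is carried over unchanged. Once you are on a version where DataFrame.replace handles this case again, the helper should no longer be needed.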
Andy Hayden answered Nov 02 '22 22:11
