In pandas v0.17.1 (Anaconda Python v3.4.3), the replace function appears to be
broken for DataFrames that contain a datetime column.
I am trying to replace a string value in my DataFrame
with a new value. This DataFrame
contains multiple columns (including a datetime column).
The replace function fails:
>>> from datetime import datetime
>>> import pandas as pd
>>> df = pd.DataFrame({'no':range(4), 'nm':list('abcd'), 'tm':datetime.now()})
>>> df.replace('a', 'A')
Traceback (most recent call last):
  File "/home/xxx/anaconda/envs/py3/lib/python3.4/site-packages/pandas/core/internals.py", line 2061, in _try_coerce_args
    other = other.astype('i8',copy=False).view('i8')
ValueError: invalid literal for int() with base 10: 'a'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/xxx/anaconda/envs/py3/lib/python3.4/site-packages/pandas/core/internals.py", line 594, in replace
    values, _, to_replace, _ = self._try_coerce_args(self.values, to_replace)
  File "/home/xxx/anaconda/envs/py3/lib/python3.4/site-packages/pandas/core/internals.py", line 2066, in _try_coerce_args
    raise TypeError
TypeError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/xxx/anaconda/envs/py3/lib/python3.4/site-packages/pandas/core/generic.py", line 3110, in replace
    inplace=inplace, regex=regex)
  File "/home/xxx/anaconda/envs/py3/lib/python3.4/site-packages/pandas/core/internals.py", line 2870, in replace
    return self.apply('replace', **kwargs)
  File "/home/xxx/anaconda/envs/py3/lib/python3.4/site-packages/pandas/core/internals.py", line 2823, in apply
    applied = getattr(b, f)(**kwargs)
  File "/home/xxx/anaconda/envs/py3/lib/python3.4/site-packages/pandas/core/internals.py", line 607, in replace
    if not mask.any():
UnboundLocalError: local variable 'mask' referenced before assignment
The same code works fine on pandas version 0.16.2.
Is this a confirmed bug?
As commented, this bug is fixed in master and the fix will be included in 0.18 (due in January 2016); see https://github.com/pydata/pandas/issues/11868. The bug was present in 0.17.1 only.
As a workaround (assuming you have no duplicate column names), the Series replace still works fine in 0.17.1, so you can apply it column by column to the object-dtype columns:
for c in df.select_dtypes(include=["object"]).columns:
    df[c] = df[c].replace('a', 'A')
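For reference, the same workaround can be wrapped in a small helper. This is only a sketch: the function name `replace_in_object_columns` is made up for illustration and is not part of pandas, and it assumes unique column names (with duplicates, `df[c]` would return a DataFrame rather than a Series). The `nm` column is given an explicit object dtype here so that `select_dtypes(include=["object"])` picks it up regardless of how a given pandas version infers string columns.

```python
from datetime import datetime

import pandas as pd


def replace_in_object_columns(df, to_replace, value):
    """Hypothetical helper: run Series.replace on object-dtype columns only.

    Datetime and numeric columns are never touched, so the
    DataFrame-level replace that fails in 0.17.1 is not called.
    Assumes column names are unique.
    """
    out = df.copy()  # leave the caller's DataFrame unmodified
    for col in out.select_dtypes(include=["object"]).columns:
        out[col] = out[col].replace(to_replace, value)
    return out


# Same shape as the DataFrame in the question; 'nm' is forced to object dtype
# (an assumption made so the example behaves the same across pandas versions).
df = pd.DataFrame({
    "no": range(4),
    "nm": pd.Series(list("abcd"), dtype=object),
    "tm": datetime.now(),
})

fixed = replace_in_object_columns(df, "a", "A")
print(fixed)
```

Returning a copy rather than assigning back into `df` is a design choice of this sketch; the in-place loop above is equivalent if you do not need the original values.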