Pandas error "Can only use .str accessor with string values"

Question

I have the following input file:

"Name",97.7,0A,0A,65M,0A,100M,5M,75M,100M,90M,90M,99M,90M,0#,0N#,

And I am reading it in with:

#!/usr/bin/env python

import pandas as pd
import sys
import numpy as np

filename = sys.argv[1]
df = pd.read_csv(filename,header=None)
for col in df.columns[2:]:
    df[col] = df[col].str.extract(r'(\d+\.*\d*)').astype(np.float)

print df

However, I get the error

    df[col] = df[col].str.extract(r'(\d+\.*\d*)').astype(np.float)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 2241, in __getattr__
    return object.__getattribute__(self, name)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/base.py", line 188, in __get__
    return self.construct_accessor(instance)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/base.py", line 528, in _make_str_accessor
    raise AttributeError("Can only use .str accessor with string "
AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas

This worked OK in pandas 0.14 but does not work in pandas 0.17.0.

EdChum · Accepted Answer

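It happens because your line ends with a trailing comma, so pandas reads an extra, empty last column. That column contains only NaN and gets a float dtype rather than object, so the .str accessor raises on it: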

In [417]:
import io
import pandas as pd

t="""'Name',97.7,0A,0A,65M,0A,100M,5M,75M,100M,90M,90M,99M,90M,0#,0N#,"""
df = pd.read_csv(io.StringIO(t), header=None)
df

Out[417]:
       0     1   2   3    4   5     6   7    8     9    10   11   12   13  14  \
0  'Name'  97.7  0A  0A  65M  0A  100M  5M  75M  100M  90M  90M  99M  90M  0#   

    15  16  
0  0N# NaN
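To confirm which columns are the problem, it can help to look at the dtypes directly. The following is a minimal sketch that rebuilds the same one-line sample; the names `last_dtype` and `numeric_cols` are only illustrative and not part of the original code.

```python
import io

import pandas as pd
from pandas.api.types import is_numeric_dtype

# Same sample line as above; the trailing comma yields an extra empty field.
t = "'Name',97.7,0A,0A,65M,0A,100M,5M,75M,100M,90M,90M,99M,90M,0#,0N#,"
df = pd.read_csv(io.StringIO(t), header=None)

# Column 16 holds only NaN, so pandas infers a numeric dtype for it.
last_dtype = df[16].dtype
print(df.dtypes.tail(3))

# Numeric columns are the ones on which the .str accessor will raise.
numeric_cols = [c for c in df.columns if is_numeric_dtype(df[c])]
print(numeric_cols)
```

Any column listed in `numeric_cols` that falls inside the range you loop over will trigger the AttributeError.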

If you slice the column range so that it stops before the last column, then it works:

In [421]:
for col in df.columns[2:-1]:
    df[col] = df[col].str.extract(r'(\d+\.*\d*)').astype(float)  # np.float was removed from newer NumPy; plain float is equivalent
df

Out[421]:
       0     1   2   3   4   5    6   7   8    9   10  11  12  13  14  15  16
0  'Name'  97.7   0   0  65   0  100   5  75  100  90  90  99  90   0   0 NaN
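Instead of hard-coding the slice, another option is to drop the empty column before the loop. This is a sketch under two assumptions: that a column with no values at all can safely be discarded, and that a reasonably recent pandas is in use, where `str.extract` accepts `expand=False` to return a Series for a single capture group.

```python
import io

import pandas as pd

t = "'Name',97.7,0A,0A,65M,0A,100M,5M,75M,100M,90M,90M,99M,90M,0#,0N#,"
df = pd.read_csv(io.StringIO(t), header=None)

# Drop every column in which all values are missing (here: column 16).
df = df.dropna(axis=1, how="all")

for col in df.columns[2:]:
    # expand=False gives back a Series, which can be cast and assigned directly.
    df[col] = df[col].str.extract(r"(\d+\.*\d*)", expand=False).astype(float)

print(df)
```

Note that `dropna(axis=1, how="all")` would also remove a genuine data column that happens to be completely empty, so check that this is acceptable for your file.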

Alternatively, you can select only the columns that have object dtype and run the same code on those (skipping the first column, as this holds the 'Name' entry):

In [428]:
for col in df.select_dtypes(include=[object]).columns[1:]:  # np.object was removed from newer NumPy
    df[col] = df[col].str.extract(r'(\d+\.*\d*)').astype(float)
df

Out[428]:
       0     1   2   3   4   5    6   7   8    9   10  11  12  13  14  15  16
0  'Name'  97.7   0   0  65   0  100   5  75  100  90  90  99  90   0   0 NaN
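The same idea can be written so that it does not depend on how pandas labels text columns, by skipping anything numeric instead of selecting object columns. This is only a sketch: `extract_first_number` and its `skip` parameter are hypothetical names introduced here, and `expand=False` assumes a reasonably recent pandas.

```python
import io

import pandas as pd
from pandas.api.types import is_numeric_dtype


def extract_first_number(df, skip=()):
    """Return a copy where each text column is reduced to its first number."""
    out = df.copy()
    for col in out.columns:
        # Numeric columns (including all-NaN ones) have no .str accessor.
        if col in skip or is_numeric_dtype(out[col]):
            continue
        out[col] = out[col].str.extract(r"(\d+\.*\d*)", expand=False).astype(float)
    return out


t = "'Name',97.7,0A,0A,65M,0A,100M,5M,75M,100M,90M,90M,99M,90M,0#,0N#,"
df = pd.read_csv(io.StringIO(t), header=None)

# Column 0 holds the name, so it is left untouched.
result = extract_first_number(df, skip=[0])
print(result)
```

Because the numeric check also covers the empty trailing column, no slicing of the column range is needed.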

SPRBRN · Answer

I got this error while working in Eclipse. It turned out that the project interpreter had somehow been reset to Python 2.7 (after an update, I believe). Setting it back to Python 3.6 resolved the issue. Along the way there were several crashes, restarts and warnings, but after a few minutes of trouble it seems to be fixed now.

While I know this is not a solution to the problem posed here, I thought it might be useful for others, as I came to this page after searching for this error.

Tags:

python

string

casting

pandas

dataframe

graffe
