I am confused by type conversion in Python pandas:
import pandas as pd
df = pd.DataFrame({'a':['1.23', '0.123']})
type(df['a'])
df['a'].astype(float)
Here df['a'] is a pandas Series whose contents are two strings. I can apply astype(float) to this Series, and it correctly converts every string to a float. However,
df['a'][1].astype(float)
gives me AttributeError: 'str' object has no attribute 'astype'. My question is: how can that be? I can convert the whole Series from string to float, but I can't convert a single entry of that Series from string to float?
Also, when I load my raw data set and run
df['id'].astype(int)
it raises ValueError: invalid literal for int() with base 10: ''
This seems to suggest that there is an empty string in my df['id'], so I check whether that is true by typing
'' in df['id']
and it returns False. So I am very confused.
The Python error "AttributeError: 'str' object has no attribute ..." occurs when code tries to access an attribute that does not exist on string objects. Attributes are the methods and properties associated with an object; everything in Python is an object, and each object's class defines which attributes are available through the . operator. For example, NumPy arrays have a size attribute that returns the number of elements in the array, whereas plain strings have no astype method. To resolve the error, make sure the value is of the expected type before accessing the attribute.
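As a minimal sketch of how the error arises and how a type check avoids it (the helper name to_float is hypothetical and not part of pandas or the question):

```python
import pandas as pd

def to_float(value):
    """Convert a Series element-wise, or a single scalar with float()."""
    if isinstance(value, pd.Series):
        # A Series has astype, a vectorized conversion
        return value.astype(float)
    # Plain Python objects such as str have no astype; use the built-in
    return float(value)

text = "0.123"
try:
    text.astype(float)  # str has no astype attribute
except AttributeError as exc:
    message = str(exc)

series_result = to_float(pd.Series(["1.23", "0.123"]))
scalar_result = to_float(text)
print(message)
print(series_result.dtype, scalar_result)
```

Checking the type first is only one option; calling float() directly on the scalar works just as well when you already know you are holding a single string.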
df['a'] returns a Series object, which has astype as a vectorized way to convert all elements in the Series to another type.
df['a'][1] returns the content of one cell of the DataFrame, in this case the string '0.123'. That is a str object, which has no astype method. To convert it, use the regular Python built-in:
type(df['a'][1])
Out[25]: str
float(df['a'][1])
Out[26]: 0.123
type(float(df['a'][1]))
Out[27]: float
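The two routes can be put side by side in a short sketch. It uses .iloc[1] for positional access, which is an assumption on my part; with the default integer index it picks the same cell as df['a'][1].

```python
import pandas as pd

df = pd.DataFrame({'a': ['1.23', '0.123']})

# Option 1: convert the single cell with the built-in float()
cell = float(df['a'].iloc[1])

# Option 2: convert the whole column first, then pick the element
converted = df['a'].astype(float)
cell_from_converted = converted.iloc[1]

print(type(df['a'].iloc[1]))  # the original cell is still a str
print(cell, cell_from_converted)
```

Note that astype returns a new Series; the original column keeps its strings unless you assign the result back, for example df['a'] = df['a'].astype(float).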
As for your second question, the in operator ends up calling __contains__ on the Series with '' as the argument. Here is the docstring of that method:
help(pd.Series.__contains__)
Help on function __contains__ in module pandas.core.generic:
__contains__(self, key)
True if the key is in the info axis
This means that the in operator searches for your empty string in the index, not in the values.
To find empty strings in the values, use the equality operator:
df
Out[54]:
a
0 42
1
'' in df
Out[55]: False
df==''
Out[56]:
a
0 False
1 True
df[df['a']=='']
Out[57]:
a
1
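The difference between testing the index and testing the values can be sketched as follows; the two-row Series is made-up data mirroring the example above.

```python
import pandas as pd

s = pd.Series(['42', ''], name='a')

in_index = '' in s            # looks at the index labels (0 and 1)
label_in_index = 1 in s       # 1 is an index label
in_values = '' in s.values    # looks at the underlying values
any_empty = (s == '').any()   # element-wise comparison, then reduce
empty_rows = s[s == '']       # boolean mask selects the matching rows

print(in_index, label_in_index, in_values, any_empty)
print(list(empty_rows.index))
```

The (s == '').any() form is usually the most convenient, since the same boolean mask can be reused to select or replace the offending rows.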
df['a'][1] will return the actual value inside the array at position 1, which is in fact a string. You can convert it by using float(df['a'][1]).
>>> df = pd.DataFrame({'a':['1.23', '0.123']})
>>> type(df['a'])
<class 'pandas.core.series.Series'>
>>> df['a'].astype(float)
0 1.230
1 0.123
Name: a, dtype: float64
>>> type(df['a'][1])
<type 'str'>
For the second question, maybe you have an empty value in your raw data. The correct test would be:
>>> df = pd.DataFrame({'a':['1', '']})
>>> '' in df['a'].values
True
Source for the second question: https://stackoverflow.com/a/21320011/5335508
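Once the blanks are located, one possible way to get past the ValueError is sketched below. It assumes the blanks really are empty or whitespace-only strings and that treating them as missing ids is acceptable; the sample data is invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({'id': ['1', '', '3']})  # made-up data with one blank

# Locate blank entries (empty or whitespace-only strings)
blank_mask = df['id'].str.strip() == ''

# errors='coerce' turns unparseable entries into NaN instead of raising
numeric = pd.to_numeric(df['id'], errors='coerce')

# The nullable Int64 dtype can hold integers alongside missing values
ids = numeric.astype('Int64')

print(blank_mask.tolist())
print(ids)
```

If missing ids are not acceptable, the mask can instead be used to drop or inspect those rows before calling astype(int).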