When I run the below code, it gives me an error saying that there is attribute error: 'float' object has no attribute 'split' in python.
I would like to know why this error comes about.
def text_processing(df):
"""""=== Lower case ==="""
'''First step is to transform comments into lower case'''
df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))
return df
df = text_processing(df)
The full traceback for the error:
Traceback (most recent call last):
File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1664, in <module>
main()
File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1658, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1068, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 53, in <module>
df = text_processing(df)
File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 30, in text_processing
df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))
File "C:\Users\L31307\AppData\Roaming\Python\Python37\site-packages\pandas\core\series.py", line 3194, in apply
mapped = lib.map_infer(values, f, convert=convert_dtype)
File "pandas/_libs/src\inference.pyx", line 1472, in pandas._libs.lib.map_infer
File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 30, in <lambda>
df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))
AttributeError: 'float' object has no attribute 'split'
The Python "AttributeError: 'list' object has no attribute 'split'" occurs when we call the split() method on a list instead of a string. To solve the error, call split() on a string, e.g. by accessing the list at a specific index or by iterating over the list. Here is an example of how the error occurs.
Conclusion # The Python "TypeError: 'float' object is not iterable" occurs when we try to iterate over a float or pass a float to a built-in function like, list() or tuple() . To solve the error, use the range() built-in function to iterate over a range, e.g. for i in range(int(3.0)): .
The Python "TypeError: 'float' object is not subscriptable" occurs when we try to use square brackets to access a float at a specific index. To solve the error, convert the float to a string before accessing it at an index, e.g. str(float)[0] .
split() is a python method which is only applicable to strings. It seems that your column "content" not only contains strings but also other values like floats to which you cannot apply the .split() mehthod.
Try converting the values to a string by using str(x).split() or by converting the entire column to strings first, which would be more efficient. You do this as follows:
df['column_name'].astype(str)
The error points to this line:
df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() \
if x not in stop_words))
split
is being used here as a method of Python's built-in str
class. Your error indicates one or more values in df['content']
is of type float
. This could be because there is a null value, i.e. NaN
, or a non-null float value.
One workaround, which will stringify floats, is to just apply str
on x
before using split
:
df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in str(x).split() \
if x not in stop_words))
Alternatively, and possibly a better solution, be explicit and use a named function with a try
/ except
clause:
def converter(x):
try:
return ' '.join([x.lower() for x in str(x).split() if x not in stop_words])
except AttributeError:
return None # or some other value
df['content'] = df['content'].apply(converter)
Since pd.Series.apply
is just a loop with overhead, you may find a list comprehension or map
more efficient:
df['content'] = [converter(x) for x in df['content']]
df['content'] = list(map(converter, df['content']))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With