I wrote a function that strips HTML tags from the strings in my DataFrame. It passes every value through the remove_html function and returns a cleaned df. Because the DataFrame is converted to strings before cleaning, I'm now trying to convert the values back to integers wherever that is possible. I have tried try/except, but I don't get the result I want. This is what I have at the moment:
def clean_df(df):
    df = df.astype(str)
    list_of_columns = list(df.columns)
    for col in list_of_columns:
        column = []
        for row in list(df[col]):
            column.append(remove_html(row))
            try:
                return int(row)
            except ValueError:
                pass
        del df[col]
        df[col] = column
    return df
Without the try/except block, the function returns a clean df in which the integers are strings, so it's only the try/except block that seems to be the problem. I've tried arranging it in several ways, and none of them returns a df. The current code, for example, returns an 'int' object.
Put the column.append call inside the try:
for col in list_of_columns:
    column = []
    for row in list(df[col]):
        try:
            column.append(remove_html(row))
        except ValueError:
            pass
    del df[col]
    df[col] = column
return df
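Note that a return inside the inner loop leaves clean_df at the first value that parses as an integer, which would explain the 'int' result described in the question. Also, appending only remove_html(row) never calls int(), so the values would stay strings. A minimal sketch of one way to get integers back is shown here. The remove_html below is a hypothetical regex stand-in (the original implementation is not shown), and to_int_if_possible is a helper name made up for illustration:

```python
import re

import pandas as pd


def remove_html(text):
    # Hypothetical stand-in for the question's remove_html:
    # strips anything that looks like an HTML tag.
    return re.sub(r"<[^>]+>", "", text)


def to_int_if_possible(text):
    # Illustrative helper: return an int when the text parses as one,
    # otherwise hand back the text unchanged.
    try:
        return int(text)
    except ValueError:
        return text


def clean_df(df):
    # astype returns a new frame, so the caller's df is left untouched.
    df = df.astype(str)
    for col in df.columns:
        cleaned = [to_int_if_possible(remove_html(value)) for value in df[col]]
        # object dtype lets ints and strings share one column.
        df[col] = pd.Series(cleaned, index=df.index, dtype=object)
    return df


example = pd.DataFrame({"a": ["<b>1</b>", "<i>x</i>"], "b": [2, 3]})
result = clean_df(example)
print(result["a"].tolist())  # [1, 'x']
print(result["b"].tolist())  # [2, 3]
```

The conversion is done per value rather than per column, so a column holding both numbers and text keeps its text and only the values that parse cleanly become integers. Nothing is returned until every column has been processed.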