I have a fairly large DataFrame loaded from a CSV file in pandas.
The problem is that some columns contain strings, and I would like to extract the last character of each string and convert it to an integer.
I found a solution, but I am fairly sure it is not the most efficient one. It goes like this:
import pandas as pd
df = pd.read_csv("filename")
cols = list(df.loc[:, 'col_a':'col_s'])
df_filtered = df[cols].dropna()
df_filtered['col_o'] = df_filtered['col_o'].str[-1:]
df_filtered['col_p'] = df_filtered['col_p'].str[-1:]
df_filtered['col_q'] = df_filtered['col_q'].str[-1:]
df_filtered['col_r'] = df_filtered['col_r'].str[-1:]
df_filtered['col_s'] = df_filtered['col_s'].str[-1:]
This is repetitive to write, so I tried something like this:
colstofilter = list(df_filtered.loc[:, 'col_o':'col_s'])
for col in df_filtered[colstofilter]:
    print(df_filtered[col].str[-1:].head())
Printing gives exactly what I want, but when I try to turn it into a function or a lambda, or apply it to the DataFrame, I get an error saying it is not supported.
Suppose that you want to replace multiple values with multiple new values in an individual DataFrame column. In that case, you may use this template: df['column name'] = df['column name'].replace(['1st old value', '2nd old value', ...], ['1st new value', '2nd new value', ...])
You can replace a string in a pandas DataFrame column by using replace() or str.replace(), including with lambda functions.
The replace() function replaces a string, regex, list, dictionary, Series, number, etc. in a pandas DataFrame.
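As a rough illustration of the replace() template above, here is a minimal sketch. The DataFrame and the column name "grade" are made up for the example and are not part of the question's data.

```python
import pandas as pd

# Hypothetical data; the column name "grade" is invented for illustration.
df = pd.DataFrame({"grade": ["low", "mid", "high", "mid"]})

# Series.replace with two equal-length lists maps old values to new ones by position.
df["grade_replaced"] = df["grade"].replace(["low", "high"], ["L", "H"])

# Series.str.replace works on substrings; regex=True treats the pattern as a regex.
df["grade_capital_m"] = df["grade"].str.replace(r"^m", "M", regex=True)
```

Note that replace() matches whole cell values by default, while str.replace() edits part of each string, so neither is a direct substitute for slicing off the last character.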
Try this:
df_filtered.loc[:, 'col_o':'col_s'] = \
    df_filtered.loc[:, 'col_o':'col_s'].apply(lambda x: x.str[-1])
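The snippet above leaves the values as one-character strings. Since the question also asks for integers, one possible extension is sketched below. It assumes the last character of every value is a digit, and the small DataFrame is a hypothetical stand-in for the CSV with only three of the columns.

```python
import pandas as pd

# Hypothetical stand-in for the CSV; only three columns are shown.
df = pd.DataFrame({
    "col_a": [10, 20, 30],
    "col_o": ["item_1", "item_2", None],
    "col_p": ["x7", "y8", "z9"],
})

# .copy() makes df_filtered an independent frame before assigning to it.
df_filtered = df.loc[:, "col_a":"col_p"].dropna().copy()

# Resolve the label slice to an explicit list of column names.
cols_to_filter = list(df_filtered.loc[:, "col_o":"col_p"].columns)

# For each selected column, keep the last character and cast it to int.
# astype(int) raises ValueError if any last character is not a digit.
df_filtered[cols_to_filter] = df_filtered[cols_to_filter].apply(
    lambda s: s.str[-1].astype(int)
)
```

DataFrame.apply passes each column to the lambda as a Series, which is why the .str accessor works here. Calling .str on the DataFrame itself does not, because the accessor is only defined for Series.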