I have a very large dataset where I want to replace strings with numbers. I would like to operate on the whole dataset without writing a mapping function for each key (column). This would be similar to the fillna method, but replacing each specific string with its associated value. Is there any way to do this?
Here is an example of my dataset
data

       resp          A          B          C
    0     1       poor       poor       good
    1     2       good       poor       good
    2     3  very good  very good  very good
    3     4        bad       poor        bad
    4     5   very bad   very bad   very bad
    5     6       poor       good   very bad
    6     7       good       good       good
    7     8  very good  very good  very good
    8     9        bad        bad   very bad
    9    10   very bad   very bad   very bad
The desired result:
data

       resp  A  B  C
    0     1  3  3  4
    1     2  4  3  4
    2     3  5  5  5
    3     4  2  3  2
    4     5  1  1  1
    5     6  3  4  1
    6     7  4  4  4
    7     8  5  5  5
    8     9  2  2  1
    9    10  1  1  1
very bad=1, bad=2, poor=3, good=4, very good=5
//Jonas
The only difference from the method you've highlighted is that df.replace({'\n': '<br>'}, regex=True) returns a new DataFrame object instead of updating the columns on the original DataFrame. So you'll need to reassign the output, e.g. df = df.replace({'\n': '<br>'}, regex=True).
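To make the "returns a new DataFrame" point concrete, here is a minimal sketch. The frame and its "text" column are made up for illustration and are not from the original question.

```python
import pandas as pd

# Hypothetical one-column frame, used only to illustrate the behaviour.
df = pd.DataFrame({"text": ["line one\nline two", "no newline"]})

# replace() returns a new DataFrame; df itself is left untouched here.
converted = df.replace({"\n": "<br>"}, regex=True)
before_reassign = df.loc[0, "text"]

# Reassign to keep the result under the original name.
df = converted
```

With regex=True the key is treated as a pattern, so the newline is replaced inside the longer string rather than only when it is the whole cell value.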
You can replace values in all or selected columns of a pandas DataFrame based on a condition by using DataFrame.loc[]. loc[] accesses a group of rows and columns by label(s) or a boolean array, and it can be used both to read and to assign values.
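As a rough sketch of the loc[] approach, the snippet below fills a numeric column row by row from a boolean mask. The small frame and the "A_score" column name are assumptions for illustration. Writing into a separate numeric column avoids mixing strings and integers in one column.

```python
import pandas as pd

# Hypothetical frame; "A_score" is an illustrative column name.
df = pd.DataFrame({"resp": [1, 2, 3, 4], "A": ["poor", "good", "bad", "good"]})
scores = {"bad": 2, "poor": 3, "good": 4}

# Start from a placeholder value, then assign per label via a boolean mask.
df["A_score"] = 0
for label, score in scores.items():
    df.loc[df["A"] == label, "A_score"] = score
```

This needs one mask per label, so for a whole-frame mapping like the one in the question, replace() is usually shorter.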
Pandas DataFrame update() method: the update() method updates a DataFrame with the non-NA values from another similar object (like another DataFrame), aligned on index and column labels. Note: this method does NOT return a new DataFrame. The update is applied to the original DataFrame.
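A small sketch of update(), assuming two made-up frames (df and patch are illustrative names). It shows that the call returns None and that only the rows and columns present in the other frame are overwritten.

```python
import pandas as pd

# Hypothetical frames for illustration.
df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})
patch = pd.DataFrame({"B": [99, 88]}, index=[0, 2])

# update() aligns on index and column labels and modifies df in place.
returned = df.update(patch)
```

Column A and row 1 are not present in patch, so they keep their original values.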
Here are two ways to replace characters in strings in Pandas DataFrame: (1) Replace character/s under a single DataFrame column: df['column name'] = df['column name'].str.replace('old character','new character') ... Replace a Specific Character under a Single DataFrame Column.
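For the single-column case, a minimal sketch might look like this. The column contents are invented for illustration. Note that str.replace works on substrings inside each cell, unlike DataFrame.replace without regex, which matches whole cell values.

```python
import pandas as pd

# Hypothetical column with underscores instead of spaces.
df = pd.DataFrame({"A": ["very_bad", "very_good", "poor"]})

# regex=False treats "_" as a literal character, not a pattern.
df["A"] = df["A"].str.replace("_", " ", regex=False)
```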
Depending on your needs, you may use either of the following methods to replace values in Pandas DataFrame: (1) Replace a single value with a new value for an individual DataFrame column: df['column name'] = df['column name'].replace(['old value'], 'new value')
The .replace() method lets you replace values across a single column, multiple columns, or an entire DataFrame. It also supports regular expressions to make complex replacements easier.
Suppose that you want to replace multiple values with multiple new values for an individual DataFrame column. In that case, you may use this template: df['column name'] = df['column name'].replace(['1st old value', '2nd old value', ...], ['1st new value', '2nd new value', ...])
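Filled in with concrete values, the template could look like the sketch below. The abbreviated labels are hypothetical. The two lists are matched by position, so they must have the same length.

```python
import pandas as pd

# Hypothetical column with abbreviated labels.
df = pd.DataFrame({"A": ["v. good", "poor", "v. bad"]})

# Each old value is replaced by the new value at the same list position.
df["A"] = df["A"].replace(["v. good", "v. bad"], ["very good", "very bad"])
```

Values that are not listed, such as "poor" here, are left as they are.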
Use replace
In [126]: df.replace(['very bad', 'bad', 'poor', 'good', 'very good'], [1, 2, 3, 4, 5])
Out[126]:
   resp  A  B  C
0     1  3  3  4
1     2  4  3  4
2     3  5  5  5
3     4  2  3  2
4     5  1  1  1
5     6  3  4  1
6     7  4  4  4
7     8  5  5  5
8     9  2  2  1
9    10  1  1  1
Assuming data is your pandas DataFrame, you can also pass the mapping as a dict and modify the frame in place:
data.replace({'very bad': 1, 'bad': 2, 'poor': 3, 'good': 4, 'very good': 5}, inplace=True)
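For anyone who wants to try this end to end, here is a self-contained sketch that rebuilds the example data from the question and applies the dict mapping. The variable names scores and result are my own. It uses the non-inplace form so the original frame stays available for comparison.

```python
import pandas as pd

# Example data reconstructed from the question.
data = pd.DataFrame({
    "resp": list(range(1, 11)),
    "A": ["poor", "good", "very good", "bad", "very bad",
          "poor", "good", "very good", "bad", "very bad"],
    "B": ["poor", "poor", "very good", "poor", "very bad",
          "good", "good", "very good", "bad", "very bad"],
    "C": ["good", "good", "very good", "bad", "very bad",
          "very bad", "good", "very good", "very bad", "very bad"],
})

# Mapping from the question: very bad=1, bad=2, poor=3, good=4, very good=5.
scores = {"very bad": 1, "bad": 2, "poor": 3, "good": 4, "very good": 5}

# Without regex, replace() matches whole cell values, so "bad" does not
# touch cells containing "very bad". Every column is handled in one call.
result = data.replace(scores)
```

Strings that are missing from the mapping would pass through unchanged, so it can be worth checking the result for leftover text. Depending on the pandas version, the converted columns may come back as object dtype, and calling astype(int) on them afterwards is one way to get integer columns.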