How to remove string value from column in pandas dataframe

Tags: python, regex, pandas, dataframe, lambda

I am trying to write some code that splits a string in a dataframe column at each comma (so it becomes a list) and removes a certain string from that list if it is present. After removing the unwanted string, I want to join the list elements again with commas. My dataframe looks like this:

df:

   Column1  Column2
0      a       a,b,c
1      y       b,n,m
2      d       n,n,m
3      d       b,b,x

So basically my goal is to remove all b values from Column2 so that I get:

df:

   Column1  Column2
0      a       a,c
1      y       n,m
2      d       n,n,m
3      d       x

The code I have written is the following:

df=df['Column2'].apply(lambda x: x.split(','))

def exclude_b(df):
    for index, liste in df['column2].iteritems():
        if 'b' in liste:
            liste.remove('b')
            return liste
        else:
            return liste

The first line splits every value in the column at the commas, turning it into a list. With the function, I then tried to iterate through all the lists and remove the b if it is present; if it is not present, the list should be returned as it is. If I print 'liste' at the end, it only returns the first row of Column2, but not the others. What am I doing wrong? And would there be a way to implement my if condition in a lambda function?

asked Oct 29 '15 11:10

sequence_hard

1 Answer

You can simply apply the regex b,? , which matches every b together with the comma that follows it, if there is one, and replace each match with an empty string:

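# regex=True is needed in pandas 2.0 and later, where str.replace treats the pattern as a literal string by default
df['Column2'] = df.Column2.str.replace('b,?', '', regex=True)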

Out[238]:
Column1 Column2
0   a   a,c
1   y   n,m
2   d   n,n,m
3   d   x
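As for why the function in the question only returns the first row: a return inside the for loop ends the function on the first iteration, and list.remove only deletes the first matching element anyway. The split/filter/join idea from the question can be written as a small function (or a lambda) passed to apply. The following is only a sketch under the assumption that every cell is a plain comma-separated string; the names drop_token, unwanted, and edge are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Column1": ["a", "y", "d", "d"],
    "Column2": ["a,b,c", "b,n,m", "n,n,m", "b,b,x"],
})

def drop_token(cell, unwanted="b"):
    """Split on commas, drop every item equal to `unwanted`, join again."""
    return ",".join(item for item in cell.split(",") if item != unwanted)

# apply calls drop_token once per row, so no explicit loop is needed.
df["Column2"] = df["Column2"].apply(drop_token)
# The same thing as a lambda:
# df["Column2"].apply(lambda s: ",".join(x for x in s.split(",") if x != "b"))
print(df)

# Two inputs where this approach and the plain regex behave differently.
edge = pd.Series(["a,b", "abc,d"])
regex_result = edge.str.replace("b,?", "", regex=True)
split_result = edge.apply(drop_token)
print(regex_result.tolist())  # ['a,', 'ac,d']
print(split_result.tolist())  # ['a', 'abc,d']
```

For the data in the question both approaches give the same result. The difference only shows when b is the last item in a cell (the regex leaves a trailing comma) or when b occurs inside a longer item (the regex removes the letter from within that item).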

answered Sep 18 '22 09:09

Nader Hisham
